scanf – C Examples

Reading data from standard input can be achieved using scanf. scanf lets a program read input in a specific format; in C this includes strings, chars, ints, doubles and hexadecimal values.

This post will outline the basic uses of scanf, and how to read data based on the format you need. Alternatively, if you are curious how to output different variable types I have an article for using fprintf.

scanf definition

The definition of the scanf function is:

int scanf(const char *format, ...);

Using scanf

Okay, now that you have seen the definition of scanf, how do you use it?! The format argument is the most important piece of this puzzle. If you are requesting user input such as someone’s name, it makes sense to treat it as a string (a character array) and use %s to indicate you are expecting a string. Alternatively, if you know you are reading in a decimal integer, the format specifier should probably be %d.

Let’s get down to business. Here is a little program that reads in various inputs using scanf, including scanf for a string, scanf for a decimal integer and scanf for a hexadecimal value. This example includes the use of fprintf.

Note: A whitespace character in the scanf format string reads and ignores any whitespace characters (this includes blank spaces and the newline and tab characters) which are encountered before the next non-whitespace character. It matches any quantity of whitespace characters, or none.

#include <stdio.h>
 
int main(int argc, char **argv)
{
  char input_array[100];
  int i;
  unsigned int hex;
 
  fprintf(stdout, "Enter your name: ");
  /* The width 99 leaves room for the terminating null byte */
  if (scanf("%99s", input_array) != 1)
    return 1;
 
  fprintf(stdout, "Enter a decimal value between 1 and 100: ");
  if (scanf("%d", &i) != 1)
    return 1;
 
  fprintf(stdout, "Name: %s , %d is between 1 and 100.\n", input_array, i);
 
  fprintf(stdout, "Enter a hexadecimal number: ");
  /* %x expects a pointer to unsigned int */
  if (scanf("%x", &hex) != 1)
    return 1;
  fprintf(stdout, "You have entered %#x (%u).\n", hex, hex);
 
  return 0;
}

Compiling and running the program we get:

erik@debian:~/scanf_ex$ ./scanf_ex
Enter your name: Erik
Enter a decimal value between 1 and 100: 45
Name: Erik , 45 is between 1 and 100.
Enter a hexadecimal number: beef
You have entered 0xbeef (48879).

Voila, not much to it really. The important thing to note is the use of the format specifier.

scanf Format Specifiers

The power of scanf lies in its ability to read input in a specific format. This is especially useful in C because C has so many types.

Length modifiers, which go between the % and the conversion specifier, as outlined in the scanf manual page:

h – Indicates that the conversion will be one of diouxX or n and the next pointer is a pointer to a short int or unsigned short int (rather than int).

hh – As for h, but the next pointer is a pointer to a signed char or unsigned char.

j – As for h, but the next pointer is a pointer to an intmax_t or uintmax_t. This modifier was introduced in C99.

l – Indicates either that the conversion will be one of diouxX or n and the next pointer is a pointer to a long int or unsigned long int (rather than int), or that the conversion will be one of efg and the next pointer is a pointer to double (rather than float). Specifying two l characters is equivalent to L. If used with %c or %s the corresponding parameter is considered as a pointer to a wide character or wide character string respectively.

L – Indicates that the conversion will be either efg and the next pointer is a pointer to long double or the conversion will be dioux and the next pointer is a pointer to long long.

q – equivalent to L. This specifier does not exist in ANSI C.

t – As for h, but the next pointer is a pointer to a ptrdiff_t. This modifier was introduced in C99.

z – As for h, but the next pointer is a pointer to a size_t. This modifier was introduced in C99.
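As a rough illustration of how these modifiers pair up with pointer types, the sketch below uses sscanf, which follows the same format rules as scanf but reads from a string. The helper name parse_sizes and the sample input are made up for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: parses "short long double size_t" from text.
 * Returns the number of items converted (4 on success). */
int parse_sizes(const char *text, short *s, long *l, double *d, size_t *z)
{
    /* %hd -> short *, %ld -> long *, %lf -> double *, %zu -> size_t * */
    return sscanf(text, "%hd %ld %lf %zu", s, l, d, z);
}
```

Calling parse_sizes("12 123456789 2.5 42", &s, &l, &d, &z) should fill all four variables. Leaving a modifier out, for example using %d with a pointer to short, is undefined behaviour.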

The following conversion specifiers are available:

% – Matches a literal %. That is, %% in the format string matches a single input % character. No conversion is done, and assignment does not occur.

d – Matches an optionally signed decimal integer; the next pointer must be a pointer to int.

D – Equivalent to ld; this exists only for backwards compatibility. (Note: thus only in libc4. In libc5 and glibc the %D is silently ignored, causing old programs to fail mysteriously.)

i – Matches an optionally signed integer; the next pointer must be a pointer to int. The integer is read in base 16 if it begins with 0x or 0X, in base 8 if it begins with 0, and in base 10 otherwise. Only characters that correspond to the base are used.

o – Matches an unsigned octal integer; the next pointer must be a pointer to unsigned int.

u – Matches an unsigned decimal integer; the next pointer must be a pointer to unsigned int.

x – Matches an unsigned hexadecimal integer; the next pointer must be a pointer to unsigned int.

X – Equivalent to x.

f – Matches an optionally signed floating-point number; the next pointer must be a pointer to float.

e – Equivalent to f.

g – Equivalent to f.

E – Equivalent to f.

a – (C99) Equivalent to f.

s – Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the terminating null character (\0), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.

c – Matches a sequence of characters whose length is specified by the maximum field width (default 1); the next pointer must be a pointer to char, and there must be enough room for all the characters (no terminating null byte is added). The usual skip of leading white space is suppressed. To skip white space first, use an explicit space in the format.
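Two of the entries above are easy to trip over: %i picks the base from the prefix, and %c does not skip whitespace unless the format contains a space. The following is a small sketch of both, again using sscanf on fixed strings; the helper names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads an integer with %i, so the base comes from
 * the prefix (0x -> 16, leading 0 -> 8, otherwise 10).
 * Returns 1 on success, 0 on failure. */
int parse_any_base(const char *text, int *out)
{
    return sscanf(text, "%i", out) == 1;
}

/* Hypothetical helper: reads an integer followed by one character.
 * The space before %c skips the whitespace that %c would otherwise
 * hand back as the character itself. */
int parse_int_then_char(const char *text, int *n, char *c)
{
    return sscanf(text, "%d %c", n, c) == 2;
}
```

With these, "0x1f" parses as 31, "017" as 15 and "17" as 17, while "45   y" yields 45 and the character y.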

scanf Return Value

These functions return the number of input items successfully matched and assigned, which can be fewer than provided for, or even zero in the event of an early matching failure.

The value EOF is returned if the end of input is reached before either the first successful conversion or a matching failure occurs. EOF is also returned if a read error occurs, in which case errno is set to indicate the error.
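To make the three outcomes concrete, here is a minimal sketch that classifies the return value. It uses sscanf so the input can be a fixed string; the helper name read_name_age and its return codes are invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: expects "<name> <age>" in text.
 * Returns 0 when both items were read, 1 when only some were matched,
 * and -1 when the input ended before anything could be converted. */
int read_name_age(const char *text, char name[32], int *age)
{
    int matched = sscanf(text, "%31s %d", name, age);

    if (matched == EOF)
        return -1;      /* end of input before the first conversion */
    if (matched < 2)
        return 1;       /* early matching failure, e.g. "Erik abc" */
    return 0;
}
```

The same comparison against the expected item count works for scanf itself; only the source of the input differs.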

Tokenize A String In C With ‘strtok’

C is not known for its ability to manipulate strings in any easy manner. Other languages such as Perl or PHP make string handling much easier. However, there are a few functions that do make string manipulation in C a lot easier; one of those is the strtok function. strtok is used to tokenize, or split, a string on one or more user-specified delimiters. This can be handy for parsing comma separated value files (csv), or any type of commonly formatted data strings or files.

The strtok function is unlike most other C functions in that its initial call takes the string as an argument, but the subsequent calls to strtok take NULL as the string. This is because strtok keeps an internal pointer behind the scenes that remembers where it left off and does all the dirty work of tokenizing the string. How is strtok defined?

C ‘strtok’ Definition

strtok is defined as:

// Header file to include
#include <string.h>
 
// function definition
char *strtok(char *str, const char *delim);

As shown above, strtok takes two arguments. The first argument, “str”, is the string we would like to tokenize or break into pieces. The second argument, “delim”, is a string holding the delimiter characters; strtok splits on any single character found in it. This may sound somewhat confusing, and thus I have an example:

Example Of Using ‘strtok’ On A C String

#include <stdio.h>
#include <string.h>
 
// Delimiter values
#define DELIM " ,.-+"
 
int main (int argc, char **argv)
{
  char str_to_tokenize[] = "- Strtok is meant for - breaking up, strings with funny values. + 5";
  char *str_ptr;
 
  fprintf(stdout, "Split \"%s\" into tokens:\n", str_to_tokenize);
 
  str_ptr = strtok(str_to_tokenize, DELIM);
  for(; str_ptr != NULL ;){
    fprintf(stdout, "%s\n", str_ptr);
    str_ptr = strtok(NULL, DELIM);
  }
 
  return 0;
}

Output from running the above example program:

$ ./token 
Split "- Strtok is meant for - breaking up, strings with funny values. + 5" into tokens:
Strtok
is
meant
for
breaking
up
strings
with
funny
values
5

A few things to note: the DELIM #define I created is actually a string of the characters I tell strtok to split on. The string I tokenized contained spaces, dashes, commas, a period and an addition sign. strtok used all of those values to break up my string, as shown by the output above. In a comma separated case (csv) your delimiter would most likely only be a comma, but be aware that strtok treats a run of consecutive delimiters as one, so empty fields are skipped. Another piece to note is that inside the for loop in the example code I call strtok(NULL, DELIM). The NULL is very important, as behind the scenes strtok has a pointer to the remaining string and knows to continue working on it through each iteration. Once strtok has completed its work, it returns NULL, which is what I used as my conditional in the for loop.

Rather than simply printing out each tokenized piece you could process and use it, or store it in an array, whatever needs to be done with each data piece. Now you don’t have to be afraid of string parsing in C!
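As one possible take on storing the pieces in an array, here is a small sketch. The helper name split_tokens and its signature are my own invention, not part of the C library.

```c
#include <string.h>

/* Hypothetical helper: splits str in place on any character in delim and
 * stores up to max_tokens pointers in tokens[]. Returns the token count.
 * str must be writable because strtok overwrites delimiters with '\0'. */
size_t split_tokens(char *str, const char *delim, char *tokens[], size_t max_tokens)
{
    size_t count = 0;
    char *tok = strtok(str, delim);

    while (tok != NULL && count < max_tokens) {
        tokens[count++] = tok;
        tok = strtok(NULL, delim);
    }
    return count;
}
```

The pointers in tokens[] point into the original buffer, so the buffer has to outlive them. Passing a string literal would crash on most systems, since strtok writes to the string.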

Ranges in C Switch Statements

Whenever possible I will try and use a switch statement instead of a set of if/else blocks. I do this for a few reasons:

  1. A switch construct is more easily translated into a jump (or branch) table. This can make switch statements much more efficient than if-else when the case labels are close together. The idea is to place a bunch of jump instructions sequentially in memory and then add the value to the program counter. This replaces a sequence of comparison instructions with an add operation.
  2. I find that it produces cleaner code, which is beneficial in the long run for legacy code and general upkeep.

One drawback of switch statements, or so I used to think, was the inability to specify a range of values within each case statement. For example in an if statement you can do something like this:

if(x > 3 && x < 8)

Is it possible to do so with a switch statement? It turns out if you are using GNU C then there is an extension that provides the ability to specify a range in your switch statements. An excerpt from the GCC help pages:

GNU C provides several language features not found in ISO standard C. (The -pedantic option directs GCC to print a warning message if any of these features is used.) To test for the availability of these features in conditional compilation, check for a predefined macro __GNUC__, which is always defined under GCC.

Using the ellipsis in a switch statement

To utilize a range within a case of your switch statement you can use an ellipsis, written as three dots: “...”. This informs the compiler to check the variable against two boundaries, a minimum and a maximum, both inclusive. Put spaces around the dots, otherwise they may be parsed incorrectly next to integer values. I thought I better test this extension with a small program before I actually use it. This program reads in one integer value using scanf, then sends this value to the switch statement, which outputs a basic string identifying which case has been matched.

#include <stdio.h>
 
int main() {
    int x;
    if (scanf("%d", &x) != 1)
        return 1;
 
    switch (x) {
       case 1 ... 100:
           printf("1 <= %d <= 100\n", x);
           break;
       case 101 ... 200:
           printf("101 <= %d <= 200\n", x);
           break;
       default:
            break;
    }
    return 0;    
}

The corresponding program output:

$ ./main
3
1 <= 3 <= 100
 
$ ./main
121
101 <= 121 <= 200

This little test program verifies that basic values will match their case equivalent. Because this is an extension, I would suggest compiling with -pedantic, which warns wherever the code depends on it, before assuming another compiler will accept it. Pretty great nonetheless.
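If the code may also need to build with a compiler that lacks the extension, one option is to guard the range syntax with the __GNUC__ check mentioned in the GCC excerpt. This is only a sketch; the function name classify and its return codes are made up for the example.

```c
/* Hypothetical helper: returns 1 for 1..100, 2 for 101..200, 0 otherwise. */
int classify(int x)
{
#ifdef __GNUC__
    /* GNU C case ranges; both bounds are inclusive */
    switch (x) {
    case 1 ... 100:
        return 1;
    case 101 ... 200:
        return 2;
    default:
        return 0;
    }
#else
    /* Standard C fallback with the same inclusive bounds */
    if (x >= 1 && x <= 100)
        return 1;
    if (x >= 101 && x <= 200)
        return 2;
    return 0;
#endif
}
```

Both branches give the same answers, so the choice only affects which compilers can build the file.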

Set Interface IP Address From C In Linux

There are a lot of cool things you can do from a C program. One of them is setting an interface’s IP address using ioctl calls. You can, of course, set it the old fashioned way using ifconfig, which I wrote about earlier. Rather than having to shell out to ifconfig from your C process, you can actually set the interface IP address using an input/output control call (ioctl), much the same way that ifconfig does itself. Holy crap Batman!

It goes without saying that if you can set the IP address then you can also read it via an ioctl call. The example I have crafted includes a function which takes two parameters: the first is the interface name, the second is the IP address in normal form (192.168.1.1), which is then converted to binary. This function will also bring the interface up prior to setting the IP address. Let’s take a look:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_arp.h>
#include <sys/ioctl.h>
#include <linux/sockios.h>
#include <errno.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#if defined(__GLIBC__) && __GLIBC__ >=2 && __GLIBC_MINOR__ >= 1
#include <netpacket/packet.h>
#include <net/ethernet.h>
#else
#include <sys/types.h>
#include <netinet/if_ether.h>
#endif
 
int set_ip(char *iface_name, char *ip_addr)
{
	if(!iface_name || !ip_addr)
		return -1;
 
	int sockfd;
	struct ifreq ifr;
	struct sockaddr_in sin;
 
	sockfd = socket(AF_INET, SOCK_DGRAM, 0);
	if(sockfd == -1){
		fprintf(stderr, "Could not get socket.\n");
		return -1;
	}
 
	/* Copy the interface name, always leaving room for the terminating null */
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, iface_name, IFNAMSIZ - 1);
 
	/* Read interface flags */
	if (ioctl(sockfd, SIOCGIFFLAGS, &ifr) < 0) {
		fprintf(stderr, "Cannot read interface flags. ");
		perror(ifr.ifr_name);
		close(sockfd);
		return -1;
	}
 
	/*
	* Expected in <net/if.h> according to
	* "UNIX Network Programming".
	*/
	#ifdef ifr_flags
	# define IRFFLAGS       ifr_flags
	#else   /* Present on kFreeBSD */
	# define IRFFLAGS       ifr_flagshigh
	#endif
 
	// If interface is down, bring it up
	if (!(ifr.IRFFLAGS & IFF_UP)) {
		fprintf(stdout, "Device is currently down..setting up.-- %u\n",ifr.IRFFLAGS);
		ifr.IRFFLAGS |= IFF_UP;
		if (ioctl(sockfd, SIOCSIFFLAGS, &ifr) < 0) {
			fprintf(stderr, "ifup: failed ");
			perror(ifr.ifr_name);
			close(sockfd);
			return -1;
		}
	}
 
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
 
	// Convert IP from numbers and dots to binary notation
	if (inet_aton(ip_addr, &sin.sin_addr) == 0) {
		fprintf(stderr, "Invalid IP address: %s\n", ip_addr);
		close(sockfd);
		return -1;
	}
	memcpy(&ifr.ifr_addr, &sin, sizeof(struct sockaddr));
 
	// Set interface address
	if (ioctl(sockfd, SIOCSIFADDR, &ifr) < 0) {
		fprintf(stderr, "Cannot set IP address. ");
		perror(ifr.ifr_name);
		close(sockfd);
		return -1;
	}
	#undef IRFFLAGS
 
	close(sockfd);
	return 0;
}
 
void usage()
{
	const char *usage = {
		"./set_ip [interface] [ip address]\n"
	};
	fprintf(stderr,"%s",usage);
}
 
int main(int argc, char **argv)
{
	if(argc < 3){
		usage();
		return -1;
	}
 
	if(set_ip(argv[1],argv[2]) != 0)
		return 1;
 
	return 0;
}

Let’s check if it works. I will do an ifconfig on eth1 to see its current IP address, then use the program above to set it to a new IP address, then use ifconfig again to view it.

$ ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:1b:21:0a:d2:cf  
          inet addr:192.168.5.12  Bcast:192.168.5.255  Mask:255.255.255.0
          inet6 addr: fe80::21b:21ff:fe0a:d2cf/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2690 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14732 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:516826 (516.8 KB)  TX bytes:2242645 (2.2 MB)
 
$ sudo ./set_ip eth1 12.13.14.15
Device is currently down..setting up.-- 4163
$ ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:1b:21:0a:d2:cf  
          inet addr:12.13.14.15  Bcast:12.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::21b:21ff:fe0a:d2cf/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2690 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14742 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:516826 (516.8 KB)  TX bytes:2247309 (2.2 MB)

Because we are modifying an interface’s IP address, we must run this process as root or with the use of sudo. Looks like it works! Of course you may want to set your IP address to something more useful than this. Also note that the program does only minimal checking of user input; you may wish to take that into account if your input comes from an untrusted source.
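As one way to tighten up the input checking, the address string could be validated before it ever reaches the ioctl call. The sketch below uses inet_pton, which only accepts a full dotted quad; the helper name is_valid_ipv4 is hypothetical.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 if text is a dotted-quad IPv4 address
 * such as "192.168.1.1", 0 otherwise. */
int is_valid_ipv4(const char *text)
{
    struct in_addr addr;

    if (text == NULL)
        return 0;
    /* inet_pton returns 1 only when the whole string is a valid address */
    return inet_pton(AF_INET, text, &addr) == 1;
}
```

This is stricter than inet_aton, which also accepts shorthand forms such as "12.13.14". Whether that strictness is wanted depends on where the input comes from.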

Change Timestamp In PCAP File With C

A while back I needed to update a pcap file with about 20,000 packets in it. Each packet needed to have its timestamp essentially one millisecond after the other. My initial thought was to just get a hex editor and modify the packets, until I realized there were 20,000 packets in this pcap file, and the pcap was over 60 megabytes of packet data. Luckily I found the pcap API, which provides a set of functions for reading a pcap file in “offline” mode. During compilation you simply have to link against the pcap library. Behind the scenes the library opens the file and reads each packet at the right offset, which happens to work great!

Thanks to some great wiki articles and references, I was able to piece together a solution for modifying the timestamps and rebuilding a valid PCAP file viewable in Wireshark.

It’s not my best work but it did the trick, and hopefully can help anyone in the future who needs to modify a PCAP file.

#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <syslog.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>
#include <net/ethernet.h>
#include <netinet/ether.h>
#include <net/if.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <netinet/tcp.h>
 
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
 
#include "pcap.h"
 
#if 0 
// For easy reference
typedef struct pcap_packet_hdr_s { 
	u_int32_t ts_sec;         /* timestamp seconds */
	u_int32_t ts_usec;        /* timestamp microseconds */
	u_int32_t capt_len;       /* number of octets of packet saved in file */
	u_int32_t orig_len;       /* actual length of packet */
} __attribute__((packed)) pcap_packet_hdr_t;
 
struct pcap_packet_s 
{
	pcap_packet_hdr_t hdr;
	u_int8_t data[0];
}__attribute__((packed)); 
typedef struct pcap_packet_s pcap_packet_t;
#endif 
 
int main(int argc, char *argv[])
{
    char ErrBuff [1024];
    int PacketCount, len;
    const u_char *PacketData;
    struct pcap_pkthdr *header;
 
    if(argc < 3){
        fprintf(stderr, "usage: ./pcap_parse in.pcap out.pcap\n");
        return -1;
    }
 
// ----- Read the header of the PCAP file
    int fp_for_header = open(argv[1],O_RDONLY);
    if(fp_for_header < 0){
        fprintf(stderr, "Cannot open PCAP file to get file header.\n");
        return -1;
    }
 
    u_int8_t data[sizeof(struct pcap_file_header)];
 
    len = read(fp_for_header, data, sizeof(struct pcap_file_header));
    if(len != (int)sizeof(struct pcap_file_header)){
        fprintf(stderr, "Could not read PCAP file header.\n");
        close(fp_for_header);
        return -1;
    }
    close(fp_for_header);
// ------- End read of header.
 
// ------- Open the output file for writing
    int fp = open(argv[2],O_CREAT|O_WRONLY|O_TRUNC,0644);
    if(fp < 0){
        fprintf(stderr, "Cannot open file: %s for writing.\n",argv[2]);
        return -1;
    }		
 
// Write header to new file
    len = write(fp, data,sizeof(struct pcap_file_header));
 
    pcap_t *pcap = pcap_open_offline(argv[1], ErrBuff);
    if (!pcap){
        fprintf (stderr, "Cannot open PCAP file '%s'\n",
                argv[1]);
                fprintf(stderr, "%s\n",ErrBuff);
                return -1;
    }
 
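// Modify each packet's timestamp to be one microsecond after the previous one
    for (PacketCount = 0; pcap_next_ex (pcap, &header, &PacketData) > 0; PacketCount++){
        // The record header on disk is four 32-bit fields (16 bytes). struct
        // pcap_pkthdr holds a struct timeval, which is wider on 64-bit systems,
        // so build the record explicitly instead of writing the struct itself.
        // Values are written in host byte order, matching a file header that
        // was captured on the same kind of machine.
        u_int32_t rec_hdr[4];
        rec_hdr[0] = 0;                 /* ts_sec */
        rec_hdr[1] = PacketCount;       /* ts_usec */
        rec_hdr[2] = header->caplen;    /* capt_len */
        rec_hdr[3] = header->len;       /* orig_len */
 
        len = write(fp,rec_hdr,sizeof(rec_hdr));
        if(len != (int)sizeof(rec_hdr)){
            fprintf(stderr, "Error occurred writing pcap_header.\n");
            pcap_close(pcap);
            close(fp);
            return -1;
        }
 
        len = write(fp,PacketData,header->caplen);
        if(len != (int)header->caplen){
            fprintf(stderr, "Error occurred writing pcap data.\n");
            pcap_close(pcap);
            close(fp);
            return -1;
        }	
    }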
 
    pcap_close (pcap);
    close(fp);
    return PacketCount;
}

The premise of this program is simple: read in the header of the entire PCAP file, which contains a ‘magic number’ that Wireshark uses to tell the validity of the file, and copy this to the head of the output file. Then use pcap_open_offline and pcap_next_ex to read past the header and fetch each packet individually, modify the timestamp of each packet as we go, write the modified packet to the output file and voila. Note that the loop uses the packet counter as the microsecond field, so consecutive packets end up one microsecond apart.

This is just a simple example of modifying the timestamp(s). In theory you could modify the IP header of each packet and change the source and destination IPs to suit a test network. Any value within the packet could be modified. Happy pcap modifying!
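For a different spacing, such as the one millisecond mentioned at the start, the seconds and microseconds fields have to be computed together so the microsecond field stays below one million. Here is a minimal sketch of that arithmetic. The struct mirrors the reference layout shown in the #if 0 block of the program, and the names record_header and set_spaced_timestamp are hypothetical.

```c
#include <stdint.h>

/* Hypothetical copy of the per-packet record header layout */
struct record_header {
    uint32_t ts_sec;    /* timestamp seconds */
    uint32_t ts_usec;   /* timestamp microseconds, kept below 1000000 */
    uint32_t capt_len;  /* number of octets of packet saved in file */
    uint32_t orig_len;  /* actual length of packet */
};

/* Sets the timestamp of packet number index so that packets are
 * step_usec microseconds apart, starting from time zero. */
void set_spaced_timestamp(struct record_header *hdr, uint32_t index, uint32_t step_usec)
{
    uint64_t total = (uint64_t)index * step_usec;

    hdr->ts_sec  = (uint32_t)(total / 1000000u);
    hdr->ts_usec = (uint32_t)(total % 1000000u);
}
```

With a step of 1000 microseconds, packet 1500 lands at 1 second and 500000 microseconds, and the last of 20,000 packets at 19 seconds and 999000 microseconds.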

Random Number Generation In C and Bash

Well that was random! Or was it? Maybe it was pseudo-random…who knows? Random number generation is an important part of cryptography when seeding a hash or creating a cipher. But how can you generate a random number from a computer, if a computer is deterministic? This is where pseudo-random and random numbers come in. Numbers computed by an algorithm are referred to as pseudo-random because a computer is deterministic. The Linux kernel, however, provides /dev/random and /dev/urandom, which gather environmental noise from the hardware in your computer through the device drivers and use it as a seed. Awesome! Wikipedia has this to say about /dev/random, which provides the ‘most’ random values:

In this implementation, the generator keeps an estimate of the number of bits of noise in the entropy pool. From this entropy pool random numbers are created. When read, the /dev/random device will only return random bytes within the estimated number of bits of noise in the entropy pool. /dev/random should be suitable for uses that need very high quality randomness such as one-time pad or key generation. When the entropy pool is empty, reads from /dev/random will block until additional environmental noise is gathered.

The /dev/urandom is non-blocking and thus will re-use the pool even with low entropy. There is also the C function call random() that can be used for pseudo-random number generation.

Let’s get to some examples!

Using /dev/random For A Random Number

To grab a random number from /dev/random we simply do a read on the file handle into an unsigned integer and voila.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
 
#define RANDPATH "/dev/random"
 
int main()
{
	int fp, ret;
	unsigned int rand;
 
        // Open as read only
	fp = open(RANDPATH, O_RDONLY);
	if(fp < 0){
		fprintf(stderr, "Could not open -%s- for reading.\n",RANDPATH);
		return -1;
	}
 
	ret = read(fp, &rand, sizeof(rand));
        if(ret != (int)sizeof(rand)){
                fprintf(stderr, "Could not read from -%s-.\n",RANDPATH);
                close(fp);
                return -1;
        }
 
	fprintf(stdout, "Random value: %u\n", rand % 10); // Range from 0 to 9
 
	close(fp);	
	return 0;
}

Easy as that. If you are planning to do many reads, take into account that /dev/random will block if the entropy is too low. If you require a large set of random numbers, use /dev/urandom, but be aware you will retrieve more pseudo-random values than random.
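One caveat with the modulo in the example is that it slightly favours the low results whenever the range does not divide evenly into the number of possible values. A common workaround is rejection sampling: throw away raw values from the uneven tail and read again. The following is a sketch only, assuming /dev/urandom is available; the helper name random_below is hypothetical.

```c
#include <fcntl.h>
#include <limits.h>
#include <unistd.h>

/* Hypothetical helper: stores a value in the range 0 .. limit-1 in *out,
 * read from /dev/urandom. Returns 0 on success, -1 on error. */
int random_below(unsigned int limit, unsigned int *out)
{
    unsigned int value;
    unsigned int cutoff;
    int fd;

    if (limit == 0 || out == NULL)
        return -1;

    /* cutoff is a multiple of limit; raw values at or above it are
     * rejected so that every remainder is equally likely. */
    cutoff = UINT_MAX - (UINT_MAX % limit);

    fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return -1;

    do {
        if (read(fd, &value, sizeof(value)) != (ssize_t)sizeof(value)) {
            close(fd);
            return -1;
        }
    } while (value >= cutoff);

    close(fd);
    *out = value % limit;
    return 0;
}
```

For a small limit the loop almost never repeats, so the cost over a plain modulo is negligible.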

Retrieve A Pseudo-Random Number With ‘random()’

Perhaps your kernel does not support /dev/random or /dev/urandom and you simply need a pseudo-random value. This can be achieved using the random() function call, providing an initial seed prior to calling random(). For example:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
 
int main()
{
	int j;
 
	// random() is seeded with srandom(); srand() seeds rand() instead
	srandom((unsigned int)time(NULL));
 
	// Scale by division instead of modulo to sidestep the weaker
	// low-order bits of some generators.
	// random() returns a value from 0 to 2^31 - 1.
	j = 1 + (int)( 10.0 * random() / 2147483648.0 );
 
	fprintf(stdout, "Random value: %d\n", j);
 
	return 0;
}

This will produce a pseudo-random value between 1 and 10. If you require a larger range, multiply by 100.0 instead of 10.0 for 1 to 100, and so on. srandom() is used to seed the generator behind random(); in this case I pass in the current time as our seed. time() returns a value in seconds, so two runs within the same second get the same seed and therefore the same values. If that matters, build the seed from a microsecond or nanosecond clock instead. Either way, seed once at startup rather than before every call.

Generate A Random Number In Bash

In Bash we can utilize the /dev/random file handle, if present in your kernel, using hexdump to grab values as hexadecimal.

rand=$((0x$(hexdump -n 1 -ve '"%x"' /dev/random) % 100))

This will give us a value between 0 and 99. Because the length is only 1 byte, the raw value ranges from 0 to 255 (2^8 values), so results from 0 to 55 come up slightly more often than the rest. If you require a larger range, you must specify it in hexdump as -n X, where X is the length in bytes, and update your modulus.

There is also an internal variable that Bash populates, $RANDOM. It provides a pseudo-random integer in the range 0 – 32767 and should never be used for encryption keys.

For example, for a value between 0 – 32767:

number=$RANDOM

You can also use the modulo operator to cap the maximum value, for example via $RANDOM % 100.

Was that random after all?

Displaying MAC Address as String Does Not Print Leading Zeroes

On many occasions it is beneficial to convert a MAC Address from its network byte order to a user readable string in the standard hex-digit and colon notation. There is a specific function that provides this functionality from the C library, known as “ether_ntoa” or “ether_ntoa_r”. Its output omits the leading zero of each octet, and depending on the usage of this string it may be advantageous to print those leading zeroes. The manual page of this function, luckily, does inform the programmer of this functionality:

The ether_ntoa() function converts the Ethernet host address addr given in
network byte order to a string in standard hex-digits-and-colons notation,
omitting leading zeros.  The string is returned in a statically allocated
buffer, which subsequent calls will overwrite.

So now that this functionality is known – what if we actually wanted to see those zeroes? Well, there are two options to fix this problem:

  1. Patch the libc function to print leading zeroes.
  2. Write your own conversion function to execute the same result.

A drawback to patching the libc function is that you may not always be compiling your code on the same computer – i.e. it is not very portable. Thus we will go with option two (2). Write our own function!

Let’s first have a look at the current implementation of “ether_ntoa”.

char *ether_ntoa (const struct ether_addr *addr)
{
  static char asc[18];
 
  return ether_ntoa_r (addr, asc);
}

Hrmm, well look at this – “ether_ntoa” calls the re-entrant implementation of “ether_ntoa” which is the “ether_ntoa_r” function. This is a GNU extension and is thread-safe and also does not utilize a static buffer, which means you will not have to worry about overwriting your buffer. Okay, so let us examine the “ether_ntoa_r” function – note these functions can be found in the source code of libc in the glibc-2.17/inet directory.

char *ether_ntoa_r (const struct ether_addr *addr, char *buf)
{
  sprintf (buf, "%x:%x:%x:%x:%x:%x",
           addr->ether_addr_octet[0], addr->ether_addr_octet[1],
           addr->ether_addr_octet[2], addr->ether_addr_octet[3],
           addr->ether_addr_octet[4], addr->ether_addr_octet[5]);
  return buf;
}

Here are the bones of the function, which requires only minimal tweaking to display the leading zeroes. It’s a basic sprintf call that stores its output in buf. So let’s tweak it to our own liking:

char *ether_ntoa_erik (const struct ether_addr *addr, char *buf)
{
  sprintf (buf, "%02x:%02x:%02x:%02x:%02x:%02x",
           addr->ether_addr_octet[0], addr->ether_addr_octet[1],
           addr->ether_addr_octet[2], addr->ether_addr_octet[3],
           addr->ether_addr_octet[4], addr->ether_addr_octet[5]);
  return buf;
}

This will now print the leading zeroes and all we had to do was modify the format. If you wanted to have your values all in capital letters just swap the lowercase “x” with an uppercase “X”.

Test Code for ether_ntoa_erik

Nothing is complete until it is tested, so here is a little test program I wrote that takes a basic MAC address with leading zeroes in it, calls the “ether_ntoa_r” function, and then calls the function I made that will print leading zeroes. Let’s check the code and output.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <netinet/ether.h>
#include <netinet/if_ether.h>
 
char *ether_ntoa_erik (const struct ether_addr *addr, char *buf)
{
  sprintf (buf, "%02x:%02x:%02x:%02x:%02x:%02x",
           addr->ether_addr_octet[0], addr->ether_addr_octet[1],
           addr->ether_addr_octet[2], addr->ether_addr_octet[3],
           addr->ether_addr_octet[4], addr->ether_addr_octet[5]);
  return buf;
}
 
int main()
{
    char str_buf[18];  /* 17 characters of "xx:xx:xx:xx:xx:xx" plus the terminating null */
    struct ether_addr erik;
 
    erik.ether_addr_octet[0] = 0x0a;
    erik.ether_addr_octet[1] = 0xbb;
    erik.ether_addr_octet[2] = 0x0c;
    erik.ether_addr_octet[3] = 0x12;
    erik.ether_addr_octet[4] = 0x04;
    erik.ether_addr_octet[5] = 0x56;
 
    ether_ntoa_r(&erik, str_buf);
    fprintf(stdout, "ether_ntoa_r:    -%s-\n", str_buf);
 
    ether_ntoa_erik(&erik, str_buf);
    fprintf(stdout, "ether_ntoa_erik: -%s-\n", str_buf);
 
    return 0;
}

Voila…now the output:

$ ./main 
ether_ntoa_r:    -a:bb:c:12:4:56-
ether_ntoa_erik: -0a:bb:0c:12:04:56-

As you can see from the output of the modified function there are now the leading zeroes. Try it out yourself! I hope this helps anyone needing zeros.

How To: Use strstr() in C

This is another addition to my earlier standard C string function how to’s. I have written in the past about strtol ( string to long ), as well as, strcpy ( used for copying strings ). In this ‘how to’ I would like to focus on using the strstr function, which allows you to locate a substring within your string. In most cases I end up using the strstr function just for detecting if a value/string exists within a string, essentially as a boolean check, rather than doing a lot of string manipulation with it. There is also a companion function known as strcasestr, a nonstandard extension available in glibc, which ignores whether the letters are upper or lower case. This becomes quite handy when you are simply checking for the presence of a string. If case is relevant to you, be sure to use strstr.

Function Definition of strstr

The strstr function is provided by the standard C string header file and used as follows:

// Header include
#include <string.h>
 
// Function definition
char *strstr(const char *haystack, const char *needle);

As outlined, the function definition is somewhat self explanatory. The first argument of the strstr function is haystack, you may have guessed it, but haystack is the string you would like to search within – for English speakers there is a saying of finding a needle in a haystack, which is exactly how this function is laid out. The second argument of strstr is needle. You guessed it, this is the string you would like to find within the first argument, haystack.

Using strstr

Now that you know how strstr is defined, the important part is using it correctly. I mentioned earlier that I usually treat strstr much like a boolean return. To do so we need to outline the return values of strstr:

  • If strstr finds the needle in the haystack it will return a pointer to the beginning of the needle within the haystack
  • If strstr cannot find the needle (string) within your haystack (string to search in) then it will return NULL

So this means to use this as a boolean return value, simply check for NULL. Of course this only applies if you plan to use it in this manner.

Here is a basic example:

#include <stdio.h>
#include <string.h>
 
int main(int argc, char **argv) {
    char s1[] = "Erik was here.";
    char *ptr;
 
    if((ptr = strstr(s1, "Erik")) == NULL)
        fprintf(stderr, "Could not find Erik\n");
    else
        fprintf(stdout, "Found String: -%s-\n", ptr);
 
  return 0;
}

In the if statement we check for NULL; if the result is NULL, we know the string does not exist within the haystack. The else case prints the string returned by strstr, which starts at the match and runs to the end of the haystack. To see the NULL check work, remove “Erik” from s1[] and recompile. The output will then become “Could not find Erik”. Try it out for yourself!

There you have it, a basic explanation and example of using the C strstr function.

Interprocess Communication With Shared Memory In C

There are various ways to communicate from one process to another. A few ways to do so are through Unix domain sockets, sockets on the local loopback, signals between processes, open files, message queues, pipes, and even memory mapped files. One interesting way is to use shared memory segments using a key to identify where in memory the shared segment will be. I have done a fair amount of interprocess communication (IPC) using sockets on the loopback as well as signalling, so I thought it would be good to delve into shared memory as a medium for IPC.

The main idea behind shared memory is based on the server / client paradigm. The server maps a section of memory and the client may have access to that shared memory for reading or writing, in doing so there is a window between the two processes in which data can be exchanged.

There are a set of functions that are used to take advantage of using shared memory.

Functions For Accessing Shared Memory

To open this so called window we have to locate the memory segment, map it, then perform an action upon it. The humanity! Here are the headers and function definitions for basic shared segment mapping.

// Required header files
#include <sys/ipc.h>
#include <sys/shm.h>
 
// Function definition
int shmget(key_t key, size_t size, int shmflg);

This function returns the identifier of the shared memory segment associated with the first argument, key. The shmget function takes three parameters. The first parameter, ‘key’, is an integer value that both processes use to identify the memory segment. The second parameter, ‘size’, is the amount of memory to allocate; the segment size is the value of size rounded up to a multiple of PAGE_SIZE. You can view the system PAGE_SIZE from the command line via “getconf PAGESIZE”, for more getconf information check out my earlier article. In my case PAGESIZE is 4096 bytes. Finally, the third parameter combines creation flags with the access permissions for the shared memory segment. The permission bits are analogous to the open(2) permission settings. In our case we use IPC_CREAT | 0666.

// Required header files
#include <sys/types.h>
#include <sys/shm.h>
 
// Function definition
void *shmat(int shmid, const void *shmaddr, int shmflg);

The shmat function attaches the shared memory segment to the address space of the calling process. The first argument is the return value from the shmget function call. The second argument is shmaddr; if it is NULL, the system chooses a suitable (unused) address at which to attach the segment. The third argument, shmflg, is a set of attach flags rather than permission bits: for example SHM_RDONLY attaches the segment read-only, while 0 attaches it for reading and writing. On success the return value is a pointer to the shared memory and can be acted upon; on failure shmat returns (void *) -1 and sets errno.

Finally, once all work on the segment is complete, we can detach it using shmdt.

// Required header files
#include <sys/types.h>
#include <sys/shm.h>
 
// Function definition
int shmdt(const void *shmaddr);

The shmaddr argument must be the address returned by the shmat call that attached the segment. I just pass the return value from shmat into shmdt and check the return value. If shmdt fails it returns -1 and sets errno appropriately.

Those are the three main functions for setting up memory mapping between processes. I always find examples useful, so I have created a server and client example. The client grabs user input from standard input and writes it to memory then the server reads that input from the first byte of memory and prints it to the screen.

Example Program Using Shared Memory

The first bit of code here is the server code. It initially creates the shared memory segment and sets the permissions accordingly. Its goal is to watch for changes to the first byte in memory and display them on standard output. You will notice the use of shmget, shmat, and shmdt within my code.

Shared Memory Server Side

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
 
#define SHMSZ 1024
 
int main(int argc, char **argv)
{
	char tmp = 0;
	int shmid;
	key_t key;
	char *shm, *s;	
 
    /*
     * Shared memory segment identified by
     * the key 1234.
     */
	key = 1234;
 
    /*
     * Create the segment and set permissions.
     */
	if ((shmid = shmget(key, SHMSZ, IPC_CREAT | 0666)) < 0) {
		perror("shmget");
		return 1;
	}
 
    /*
     * Now we attach the segment to our data space.
     */
	if ((shm = shmat(shmid, NULL, 0)) == (char *) -1) {
		perror("shmat");
		return 1;
	}
 
	/*
	 * Zero out memory segment
	 */
	memset(shm,0,SHMSZ);
	s = shm;
 
	/*
	* Read user input from client code and tell
	* the user what was written.
	*/
	while (*shm != 'q'){
		sleep(1);
		if(tmp == *shm)
			continue;
 
		fprintf(stdout, "You pressed %c\n",*shm);
		tmp = *shm;
	}
 
	if(shmdt(shm) != 0)
		fprintf(stderr, "Could not close memory segment.\n");
 
	return 0;
}

Shared Memory Client Side

This client uses getchar() to retrieve user input and stores it in the first byte of the memory segment for the server to read.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
 
#define SHMSZ     1024
 
int main(void)
{
	int shmid;
	key_t key;
	char *shm, *s;
 
	/*
	* We need to get the segment named
	* "1234", created by the server.
	*/
	key = 1234;
 
	/*
	* Locate the segment.
	*/
	if ((shmid = shmget(key, SHMSZ, 0666)) < 0) {
		perror("shmget");
		return 1;
	}
 
	/*
	* Now we attach the segment to our data space.
	*/
	if ((shm = shmat(shmid, NULL, 0)) == (char *) -1) {
		perror("shmat");
		return 1;
	}
 
	/*
	* Zero out memory segment
	*/
	memset(shm,0,SHMSZ);
	s = shm;
 
	/*
	* Client writes user input character to memory
	* for server to read.
	*/
	for(;;){
		int tmp = getchar();
		// Eat the enter key
		getchar();
 
		// Treat end of input like 'q' so the loop cannot spin on EOF
		if(tmp == 'q' || tmp == EOF){
			*shm = 'q';
			break;
		}
		*shm = tmp;
	}
 
	if(shmdt(shm) != 0)
		fprintf(stderr, "Could not close memory segment.\n");
 
	return 0;
}

The output of the two programs is as follows. I start the server first as it creates the memory segment, then within the client I enter the characters I want the server to output.

. . . 

This is a basic example of passing data between processes. Rather than simply modifying one character in memory I could pass a structure containing various fields, or structs within structs. You are only limited by the amount of memory and your imagination as to what you could pass back and forth. There you have it, a basic example of two separate processes passing data between each other.

Inline Assembly In Your C Program

So you think you’re a hacker and you want to embed (inline) assembly into your C program. Well, in many cases it is this low level material that is used to exploit other programs. The ability to manipulate registers can be very powerful. I am thinking more about the efficiency advantages that can be gained by writing a function in assembler rather than C. This used to be more common with early compilers, but these days most compilers can produce more efficient assembly than a human. There are still some circumstances, usually in the embedded systems environment, where it can be advantageous to embed assembly into your C program.

The GNU C compiler (GCC) on Linux uses the AT&T assembler syntax by default instead of the Intel assembler syntax; nowadays you can add a directive to inform the assembler to use the Intel syntax. The Intel syntax is more reflective of the most common assembly paradigms. The Intel syntax has the first operand as the destination and the second operand as the source, whereas in AT&T syntax the first operand is the source and the second operand is the destination.

Okay, let's get on with it. The examples of assembly are for the x86 architecture.

Embed Assembly In C

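To embed your assembly instructions into your C program we use the asm keyword. We can also use the alternate spelling __asm__ to ensure we do not conflict with anything else named asm; it also keeps working in strict ISO C mode (for example -std=c99), where GCC does not recognize plain asm.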

int main()
{
   __asm__("movl %esp,%eax");
 
  return 0;
}

This little function copies a long (a 32-bit value) from the stack pointer into register ‘a’ (eax); remember that in AT&T syntax the source comes first. As you can see it's quite easy to embed assembly into your C program. Let's do something ‘useful’.

#include <stdio.h>
#include <unistd.h>
 
int main()
{
  int data1 = 2, data2 = 3;
 
  // Fancy assembly statement
  __asm__("addl  %%ebx,%%eax"
                    :"=a"(data1)
                    :"a"(data1), "b"(data2)
              );
 
  fprintf(stdout, "data1 + data2 = %d\n", data1);
 
  return 0;
}

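The above example is the addition of two integers. The syntax I am using is GCC's extended asm form, written in AT&T syntax. Because this form takes operands, register names inside the string are written with a double percent sign (%%eax). The general layout is: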

__asm__("<asm routine>" : output : input : modify);

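In this case I add data1 and data2. The output is “=a”, meaning register ‘a’ is written back to data1; the inputs are data1 in register ‘a’ and data2 in register ‘b’. The fourth field, modify, lists anything else the assembly changes besides its outputs, and it is left out here. At the end I use my C fprintf statement to output the result of 5.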

If you want to specifically use the Intel syntax you can inform the compiler of this by using:

__asm__(".intel_syntax noprefix\n\t"
                     "pop edx\n\t"
                     "mov eax,edx\n\t"

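The .intel_syntax directive tells the assembler that the instructions that follow are written in Intel form. Because the rest of the compiler's output is still in AT&T form, switch back with the .att_syntax directive at the end of the assembly block.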

Embed Volatile Assembly

In C you can declare an asm statement as volatile. What is a volatile asm statement? Volatile informs the compiler that this piece of code should not be optimized away or hoisted out of a loop as an efficiency optimization, even if its outputs look unused. For example:

__asm__ __volatile__("blah"

Now you can become the hacker you always dreamed of becoming!