How To: Modify Bash Environment in C

This article will outline how to modify environment variables from a running C process. It should be noted that setting variables from your C process into the environment will only persist during process lifetime. Environment variables are propagated forward to a child program, but not backward to the parent.

Earlier I wrote about creating, removing, and changing environment variables directly in Bash; now I will show you how to make these modifications using C.

Creating An Environment Variable In C process

Creating the environment variable can easily be achieved using the C function int setenv(const char *name, const char *value, int overwrite);. It takes three parameters: the name of the environment variable, the value to assign to it, and a flag indicating whether to overwrite a pre-existing variable of the same name. It returns 0 on success and -1 on error.

#include <stdio.h>
#include <stdlib.h>
 
int main()
{
	char *name = "DATA";
	char *value = "erik";
 
	if(setenv(name, value, 1) < 0){
		fprintf(stderr, "Could not create environment variable.\n");
		return -1;
	}
 
	fprintf(stdout, "-%s-\n", getenv("DATA"));
	return 0;
}

The variable will be visible to any children you fork, and to any command you launch from the process (for example with system()), because children inherit a copy of the parent's environment.

Removing An Environment Variable Using unsetenv

To remove an environment variable using your C process, you may call int unsetenv(const char *name);, where name is the name of the variable in the environment. If name does not exist in the environment, then the function succeeds, and the environment is unchanged.

unsetenv(name);

Clear The Environment Using clearenv

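To clear the entire environment you can call clearenv(), which is not part of POSIX but is provided by glibc and some other C libraries:

clearenv();

Or, on systems where clearenv() is unavailable, you can assign NULL to the global environ pointer (declared with extern char **environ;):

environ = NULL;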

Reading An Environment Variable Using getenv

To just read an environment variable there is the C call char *getenv(const char *name);. This returns a pointer to the value of the environment variable, or NULL if the variable is not set.

getenv("DATA");

Those are the basic functions you can call from your C process to create, read, and remove environment variables.

C getchar() Usage and Examples

The getchar() function is used to read input character by character from the standard input stream. getchar() takes no arguments (its parameter list is void) and returns the next character from standard input. This can be used for basic input into any C program.

I will outline a basic example here.

How To Use getchar()

The getchar() function, as I mentioned above, is quite basic. I will show you how to read input from standard input and print it back out to the terminal on standard output.

#include <stdio.h>
 
int main()
{
        int c;  // int, not char, so EOF can be told apart from real characters
        for(;;){
                c=getchar();
                if(c == EOF || c == 'q') // Stop at end of input or on 'q'
                        break;
                fprintf(stdout, "%c\n", c);
        }
        return 0;
}

To compile this program:

erik@debian:~/getchar_ex$ gcc -o getchar_test getchar_test.c

Using this function:

erik@debian:~/getchar_ex$ ./getchar_test
a
a
 
 
b
b
 
 
c
c
 
 
q
erik@debian:~/getchar_ex$

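As you can see, we read in each character and the process prints it back out. The blank lines appear because the newline from pressing Enter is itself read and echoed as a character, and typing q ends the loop before anything is printed. Nothing to it really.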

getchar() Return Values

getchar() returns the character read as an unsigned char converted to an int. If there is an error or end of file is reached, the function returns EOF.

C Debugging Macros

Any experienced programmer can relate to sprinkling their code with printf statements to try and track down a NULL pointer, or to check whether a function has been reached. This is the first step towards debugging code. Further down the line you might use GDB or Valgrind to trace code paths, modify variables at runtime, check memory usage, or look for leaks from malloc'd data that was never freed before program exit. These tools are usually pulled out when the going gets tough! As an initial pass during basic unit testing, debug output is extremely handy. What I usually do is create a header file (*.h) that can be included in any (*.c) files I happen to be working on. This way, if I modify my macro, the change only has to be made in one place. For example:

#include <stdio.h>
#include <stdlib.h>
 
#ifdef DEBUG
#define DEBUGP(x, args ...) fprintf(stderr, "[%s(), %s:%d] " \
x, __FUNCTION__, __FILE__,__LINE__, ## args)
#else
#define DEBUGP(x, args ...)
#endif
 
void calculation() {
	DEBUGP("data: %u\n",5 * 5);
}
 
int main()
{
	DEBUGP("Program started.\n");
	calculation();
	return 0;
}

Now every time we want to output any debug information we can call DEBUGP(..data..);. In doing so, we also get the file name, the function it was called in, and the line it occurred at. This is extremely helpful for debugging purposes. The macro relies on the predefined identifiers __LINE__, __FUNCTION__, and __FILE__ (__FUNCTION__ is a GCC extension; standard C99 spells it __func__). Another piece to note is the #ifdef DEBUG: if DEBUG is defined, DEBUGP expands to the fprintf call; if it is not defined, the empty "#define DEBUGP(x, args ...)" is used and no debug data is printed. This means we can pass the -DDEBUG flag to the compiler to turn debugging on, or leave it out to turn it off, effortlessly.

So putting this all together, let's compile the program with the flag enabled.

erik@debian:/debug$ gcc -DDEBUG -o debug debug.c

Now that it is compiled, let's run it.

erik@debian:/debug$ ./debug
[main(), debug.c:18] Program started.
[calculation(), debug.c:12] data: 25

So as we can see, with a little bit of organization and taking advantage of C macros we can make debugging a lot easier for ourselves!

Convert IP Address to Decimal in C

Ever needed to convert an IP address as a string (dotted quad) to its decimal representation? Here is a quick program I wrote in C that will convert an IP address string from the command line into its decimal representation, then from that decimal representation back into its dot quad form.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <string.h>
#include <errno.h>
 
int main(int argc, char *argv[])
{
        struct in_addr addr;
        if(argc < 2){
                fprintf(stderr, "usage: ./quad_to_byte [ip_address]\n");
                return -1;
        }
 
        // INET_ADDRSTRLEN holds any IPv4 dot quad plus the terminating '\0'
        char *byte_order = malloc(INET_ADDRSTRLEN);
        if(!byte_order){
                fprintf(stderr, "Could not allocate memory for conversion.\n");
                return -1;
        }
 
        // Convert address to byte order (inet_pton returns 1 on success)
        if(inet_pton(AF_INET, argv[1], &addr) != 1){
                fprintf(stderr, "Could not convert address\n");
                free(byte_order);
                return -2;
        }
 
        // Print out network byte order (s_addr is an unsigned 32-bit value)
        fprintf(stdout, "Network byte order: %u\n",addr.s_addr);
 
        // Convert it back to our dot quad to verify
        if(inet_ntop(AF_INET, &addr, byte_order, INET_ADDRSTRLEN) == NULL){
                fprintf(stderr, "Could not convert byte to address\n");
                fprintf(stderr, "%s\n",strerror(errno));
                free(byte_order);
                return -3;
        }
 
        // Display our dot quad converted from the network byte order
        fprintf(stdout, "Dot quad: %s\n",byte_order);
        free(byte_order);
        return 0;
}

Then to compile it you simply run:

gcc -o quad_to_byte quad_to_byte.c

Then running the simple program:

erik@debian:$ ./quad_to_byte 10.3.4.2
Network byte order: 33817354
Dot quad: 10.3.4.2

It's really that easy. If you only need the conversion in one direction, use inet_pton to go from the IP address to decimal, or inet_ntop to go the other way.

Creating a Pipe in C

What is a pipe in C?

A pipe is a unidirectional data channel that can be used for interprocess communication. One process may write to the pipe, while another process may read from it. This way you can communicate from one process to another through this channel.

How do I use a pipe in C?

Here is a quick example from the Linux pipe(2) man page of setting up a pipe in C, with a parent process writing to the pipe and a child process reading from it.

#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
 
int
main(int argc, char *argv[])
{
	int pipefd[2];
	pid_t cpid;
	char buf;
 
	if (argc != 2) {
		fprintf(stderr, "Usage: %s <string>\n", argv[0]);
		exit(EXIT_FAILURE);
	}
 
	if (pipe(pipefd) == -1) {
		perror("pipe");
		exit(EXIT_FAILURE);
	}
 
	cpid = fork();
	if (cpid == -1) {
		perror("fork");
		exit(EXIT_FAILURE);
	}
 
	if (cpid == 0) {    /* Child reads from pipe */
		close(pipefd[1]);          /* Close unused write end */
 
		while (read(pipefd[0], &buf, 1) > 0)
			write(STDOUT_FILENO, &buf, 1);
 
		write(STDOUT_FILENO, "\n", 1);
		close(pipefd[0]);
		_exit(EXIT_SUCCESS);
 
	} else {            /* Parent writes argv[1] to pipe */
		close(pipefd[0]);          /* Close unused read end */
		write(pipefd[1], argv[1], strlen(argv[1]));
		close(pipefd[1]);          /* Reader will see EOF */
		wait(NULL);                /* Wait for child */
		exit(EXIT_SUCCESS);
	}
}

C Shared Libraries – Static and Dynamic

As a programmer I often find myself having to decide not only how to organize my code within a set of files, but also deciding if this code may be used by another program for some inter-process communication or as a generic piece of code that I could see many other programs using, a utilities library for example. This leads to the idea of shared components or shared libraries. These libraries may be referenced (linked) by your program during compilation or loaded during runtime with a set of tools. In fact, if you have done any threaded work on a Linux system you have probably come across a reference to -lpthread. This tells the compiler to link to that library during compilation. A library is also a way of releasing an API to interface with an application instead of having to compile in those functions. Libraries also reduce the size of a program because the code is in one single place and may be referenced by many applications at once.

Types of Linux Libraries

In Linux there are two types of C/C++ libraries that can be made.

  • Dynamically linked shared object libraries (*.so files).
  • Static libraries (*.a files), which are linked in and become part of the application.

The dynamic shared object files are linked at runtime: they must already be compiled and reachable during the compilation and linking phase, but they are not included within the executable. The other option is to load and unload them explicitly while the program is running, using the dynamic loading functions (dlopen, dlsym and dlclose).

Library names carry the prefix lib. If you look in the /usr/lib/ directory you will see a few hundred libraries used by your running Linux system. When linking a library into your program you simply use -l followed by the name without the prefix, so for p-threads as shown above you link with -lpthread.

How Do You Generate A Static Library?

To create a static library (ending in *.a) you compile your program as you would normally:

gcc -Wall -c temp_test.c

This will create a temp_test.o file, or object file. To create the library the next step is to use the ar utility.

ar -cvq libtemp_test.a temp_test.o

Voila, you now have a static library that can be used during the compilation phase. For example:

gcc -o eriks_program eriks_program.c libtemp_test.a
--- OR ---
gcc -o eriks_program eriks_program.c -L. -ltemp_test

How Do You Generate A Dynamic Library?

When creating the dynamic library a few more flags during compilation are required. The creation of the object code or object file is required in the same way it is for the static library, however the -fPIC flag is required. It states that position independent code needs to be outputted for a shared library. Here are the steps required:

gcc -Wall -fPIC -c temp_test.c
gcc -shared -Wl,-soname,libtemp_test.so.1 -o libtemp_test.so.1.0 temp_test.o
mv libtemp_test.so.1.0 /lib
ln -sf /lib/libtemp_test.so.1.0 /lib/libtemp_test.so
ln -sf /lib/libtemp_test.so.1.0 /lib/libtemp_test.so.1

These steps create the libtemp_test.so.1.0 library as well as the symbolic links that allow compilation against -ltemp_test and run time binding. So what do these compiler options mean? I will save you the trouble of having to look them up:

  1. -Wall: Enable the common set of compiler warnings.
  2. -fPIC: Compiler directive to output position independent code, a characteristic required by shared libraries.
  3. -shared: Produce a shared object which can then be linked with other objects to form an executable.
  4. -Wl,: Pass the comma-separated options that follow to the linker; here -soname,libtemp_test.so.1 sets the soname recorded in the library.

Now to compile a program using the libtemp_test.so.1.0 library you simply:

gcc -Wall -I/path/to/include-files -L/path/to/libraries eriks_program.c -ltemp_test -o eriks_program
--- OR ---
gcc -Wall -L/lib eriks_program.c -ltemp_test -o eriks_program

Et voila!

C Switch Statements

When I first began programming I remember I designed a tic-tac-toe game in C that had a massive if-else statement, probably with 15 or so if, else if’s…it was awful and it looked awful too. Since then I have moved on to using the switch statement. Using a switch statement in any imperative programming language usually has many advantages: it is easier to read, easier to debug, easier to understand, and easier to maintain. There can also be efficiency benefits, because the compiler may implement the switch statement as an indexed branch table (or jump table) instead of a chain of comparisons. This can improve the program's flow considerably.

I would like to start with an if statement, then show you the same logic implemented as a switch statement.

The Gross If Else

int x;
 
if(x == 1){
 // ... do something
}else if(x == 2){
 // ... do something
}else if(x == 5){
 // ... do something
}else if(x == 12){
 // ... do something crazy
}

This could go on for ages, and as you can see everything sort of gets jumbled together. It is not as clear as it could be. Okay, let's look at that same example again, but this time I will take advantage of the C switch statement.

Yay Switch Statement!

int x;
 
switch(x){
     case 1:
        // ... do something
        break;
     case 2:
        // ... do something
        break;
     case 5:
        // ... do something
        break;
     case 12:
        // ... do something crazy
        break;
     default:
        // ... we didn't match anything
        break;
}

The default case is a ‘catch-all’, meaning if ‘x’ doesn’t match any of the above options then default will be hit. It is optional, and in many cases the programmer knows which values can occur, but it is a convenient place to catch unexpected ones.

The break statements tell the program to jump to the end of the switch statement; without one, execution continues into the next case. The switch statement is elegant in that you can use it to better organize functionality. For example, case one could mean process this type of data, i.e. traverse this path, and case two could mean remove this type of data, i.e. traverse the delete data path. A limitation of the switch statement is its inability to deal with strings. The switch can only act upon integer values (including characters and enumeration constants). However, you can do some useful things by creating an enumeration so your switch statement can then have more descriptive names to outline a code path. For instance:

enum state
{
   STATE_UP = 0,
   STATE_DOWN,   // This will automatically be 1
   STATE_LEFT,   // 2
   STATE_RIGHT   // 3
};
 
switch( x )
{
   case STATE_UP:
       call_up_function();     
       break;
 
   case STATE_DOWN:  
       call_down_function();   
       break;
 
   case STATE_LEFT: 
       call_left_function();    
       break;
 
   case STATE_RIGHT:
   case 12:
       call_right_function();
       break;
}

Notice in the above example I combine STATE_RIGHT and 12; that means if we match on either STATE_RIGHT or 12 then call_right_function() will be executed. This is known as fall-through: because there is no break after case STATE_RIGHT, execution continues into the next case. It is equivalent to:

if(x == STATE_RIGHT || x == 12){
   call_right_function();
}

There you have it. In using a switch statement you can clean up your code, make it more efficient and easier to read!

C ‘sizeof’ Operator Explanation and Examples

The sizeof operator is used immensely in C. I use it every time I allocate memory dynamically. For the longest time I thought it was a function, and always found it strange that it was never defined in a manual page. Further investigation showed me that sizeof is an operator built into the language and evaluated by the compiler, which makes sense when you think about how a program is built. At compile time, the compiler determines the type of the operand and yields the size of that type in bytes.

sizeof is a unary operator, meaning it takes only one operand.

Here is an example program using sizeof for differing data types, including a structure with an array.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
 
struct data_len {
        char data1;
        int data2;
        long data3;
        double data4;
        int data5[10];
};
 
int main(int argc, char **argv)
{
        // Examples of the sizeof operator.
        fprintf(stdout, "%zu\n%zu\n%zu\n%zu\n%zu\n%zu\n",
                        sizeof(char),
                        sizeof(int),
                        sizeof(double),
                        sizeof(long),
                        sizeof(unsigned int),
                        sizeof(struct data_len)
                        );
 
        return 0;
}

The output:

$ ./sizer
1   // char
4   // int
8   // double
4   // long
4   // unsigned int
60  // structure: char + int + long + double + (int x 10) = 57, plus 3 bytes of alignment padding

This is a handy and necessary operator to have, especially for figuring out how many bytes you need when allocating memory on the fly. Even more importantly, a long on a 32-bit architecture can differ from a long on a 64-bit machine, so if you hard-coded a long as 4 bytes, allocated memory based on that, and then moved your code to a 64-bit architecture, your calculations would be off. This is where buffer overflows can occur. If you had allocated based on a long being 4 bytes and then went to a 64-bit architecture where a long is 8 bytes, every write to the long would spill into the 4 bytes after it and your data would be overflowing into who knows where.

A cool trick with sizeof is that it can be used to find the number of elements in an array. At first you might think, well, I know what the length of the array is since I declared it. But if the size comes from an initializer or from a constant that may change later, computing it keeps the code correct. What you can do in C is:

int data[] = {3, 1, 4, 1, 5, 9};
int len;
 
// Total bytes in the array divided by the bytes in one element
len = sizeof(data) / sizeof(data[0]);

This length calculation is only valid on a true array, in the scope where it was declared. It does not work for memory obtained from malloc: there the variable is a pointer, so sizeof gives the size of the pointer, and you must keep track of the element count yourself. Likewise, if you pass an array to a function it decays to a pointer, so the calculation is done on the pointer rather than the object itself and gives you the wrong value.

A common mistake with sizeof is applying it to a pointer to an array rather than to the array itself. If we have two strings, one declared as const char *str_ptr and the other as const char str[] = "howdy", and take sizeof of both, sizeof(str_ptr) always returns the size of a pointer (typically 4 or 8 bytes), whereas sizeof(str) returns 6 because the array holds five characters plus the terminating null character. You have to pass in the actual array rather than a pointer to one.

What cool tricks have you done with sizeof?

C strtol – String to Long

Yesterday I wrote about the atoi function for converting ASCII strings to integers. I also hinted that the strtol function, or string to long, can do everything that atoi does, but with more features. In fact, I also mentioned that atoi is widely discouraged in new code and strtol is the usual replacement. Let's take a look at strtol.

Advantages of strtol

The strtol function has a few advantages over the atoi function, including:

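  1. strtol – reports where parsing stopped through endptr, so leftover characters can be detected.
  2. strtol – will set errno on error, for example to ERANGE when the value does not fit in a long.
  3. strtol – provides support for any base from 2 to 36 inclusive, or 0 to detect the base from the prefix.
  4. strtol – like atoi, skips leading whitespace and treats a leading ‘-’ as a negative value.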

strtol Function Prototype and Example

As I always say, an example always makes understanding a bit easier. The function prototype for strtol is:

#include <stdlib.h>
 
long int strtol(const char * str, char ** endptr, int base);

The first argument is the string to convert. The second argument is a reference to an object of type char*, whose value is set by the function to the next character in str after the numerical value. This parameter can also be a NULL pointer, in which case it is not used. In most cases you can simply use the NULL pointer, unless your string has multiple values separated by spaces, in which case you pass the endptr as the starting point of the next call. Finally, the third argument is the base to use. If you have a binary string then base 2 is your choice, if you are working with decimal values then base 10 is your game, base 16 for hexadecimal and so on. Easy!

For your viewing pleasure, a C example:

#include <stdio.h>
#include <stdlib.h>
 
int main ()
{
  char basic_str[] = "8887";
  char data_string[] = "1234 1a2b3c4e -10101010 0xdeadbeef";
  char *end_char;
  long int basic_1, d1, d2, d3, d4;
 
  // Basic conversion
  basic_1 = strtol(basic_str, NULL, 10);
  fprintf(stdout, "Basic string %ld\n", basic_1);  
 
  // Using the endptr
  d1 = strtol(data_string,&end_char,10);
  d2 = strtol(end_char,&end_char,16);
  d3 = strtol(end_char,&end_char,2);
  d4 = strtol(end_char,NULL,0);
  fprintf(stdout, "The converted values are: %ld, %ld, %ld and %ld.\n", d1, d2, d3, d4);
 
  return 0;
}

The output from running this program:

$ ./strtol 
Basic string 8887
The converted values are: 1234, 439041102, -170 and 3735928559.

As you can see from the example there are two ways to utilize the strtol function. The first is to simply convert a known good string with a known quantity. The second usage is to take advantage of the endptr knowing that the string to be parsed contains multiple values for conversion and thus specify the endptr rather than using NULL.

Keep in mind, I could check the result of each call to strtol: if the value is out of range, LONG_MIN or LONG_MAX is returned and errno is set to ERANGE. There is a handy program from the manual pages that checks the results of strtol. I will include it here for completeness.

#include <stdlib.h>
#include <limits.h>
#include <stdio.h>
#include <errno.h>
 
int main(int argc, char *argv[])
{
  int base;
  char *endptr, *str;
  long val;
 
  if (argc < 2) {
    fprintf(stderr, "Usage: %s str [base]\n", argv[0]);
    exit(EXIT_FAILURE);
  }
 
  str = argv[1];
  base = (argc > 2) ? atoi(argv[2]) : 10;
 
  errno = 0;    /* To distinguish success/failure after call */
  val = strtol(str, &endptr, base);
 
  /* Check for various possible errors */
  if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN))
      || (errno != 0 && val == 0)) {
    perror("strtol");
    exit(EXIT_FAILURE);
  }
 
  if(endptr == str) {
    fprintf(stderr, "No digits were found\n");
    exit(EXIT_FAILURE);
  }
 
  /* If we got here, strtol() successfully parsed a number */
  printf("strtol() returned %ld\n", val);
 
  if (*endptr != '\0')        /* Not necessarily an error... */
    printf("Further characters after number: %s\n", endptr);
 
  exit(EXIT_SUCCESS);
}

While the above may seem like overkill, you could turn it into a wrapper function around strtol, so that every error is caught in one place and callers only see success or failure. There you have it, a few examples of using strtol and a few of the reasons why it is recommended to use this over atoi.

ASCII To Integer – atoi in C

If you have ever had to read input from a command line into your C program you may have stumbled across the atoi function, especially if you were reading numeric input. Perhaps you have a file that contains a set of stock data that needs to be converted to integers. To interpret and use the values you have to convert them from their ASCII representation to their integer equivalent. Only languages that deal in types have to deal with these conversions, yay C! Luckily there are built-in functions like atoi to use.

atoi or Ascii to Integer can be found within the stdlib.h header file. Its prototype is:

#include <stdlib.h>
 
int atoi(const char *nptr);

The atoi function simply takes a character array/pointer and returns the converted value; this value can then be stored as an integer and off you go. Let's look at an example:

#include <stdlib.h>
#include <stdio.h>
 
int main(void)
{
  char *str = "6543";
  char *str2 = "123erik";
 
  fprintf(stdout, "The string %s as an integer is = %d\n",str,atoi(str));
  fprintf(stdout, "The string %s as an integer is = %d\n",str2,atoi(str2));
 
  return 0;
}
---- Output ----
The string 6543 as an integer is = 6543
The string 123erik as an integer is = 123

Notice in the example I am using %d to print a decimal value, which works fine as the return value of atoi is an integer.

atoi Limitations

  1. atoi is not guaranteed to be thread-safe on every platform, although common implementations such as glibc document it as safe to call from multiple threads.
  2. atoi does not permit asynchronous calls on some operating systems.
  3. atoi will only convert from base 10 ASCII to integer.
  4. atoi is not formally deprecated, but strtol is generally recommended in its place.
  5. atoi will not set errno on error; invalid input simply yields 0.

It is now suggested that the strtol (string to long) function be used for conversion. It can convert a string in any base from 2 to 36 inclusive, and errno is set on error so problems with its return value can be detected. I will look at strtol in more detail in my next post. If you just need a quick conversion from ASCII to integer, however, atoi may be your way to go.