C: Unions

Conceptually, unions are similar to structures, but they differ in how they are stored. In a structure, each member is allocated its own storage location. The members of a union, on the other hand, share a single memory location that is at least as large as the largest member.

As with structures, defining a union in C starts with the keyword union, followed by a name and a pair of curly braces enclosing the data fields, known as elements or members. Here we define a union named film with three members: title, runtime and imdb.


union film {
	char title[16];
	unsigned runtime;
	float imdb;
};

A character array inside a union must have a specified size; a union cannot contain an unsized (flexible) array. A structure permits one only as its last member, from C99 on. Hence, declaring the first member as char title[] without specifying the size would produce a compilation error.

Memory Allocation

To illustrate the difference in memory allocation between structures and unions, we define a structure, say movie, with the same members as the previously declared union film.

For the members of this newly defined structure,

  • 16 bytes of memory are allocated to the character array title,
  • 4 bytes to runtime of type unsigned int, and
  • another 4 bytes to imdb of type float.

Summing it all up, we get a total of 24 bytes, which is also the size of the structure as computed by the sizeof operator inside the first printf() statement.

For the union, however, we get a smaller size of just 16 bytes, despite it having exactly the same members as the structure movie. This is because the allocated storage is the size of the largest member of the union, which is the 16-byte character array title.


#include <stdio.h>

struct movie {
	char title[16];
	unsigned runtime;
	float imdb;
};

union film {
	char title[16];
	unsigned runtime;
	float imdb;
};

struct movie m;
union film f;

int main(void) {
	printf("struct: %zu\n", sizeof(m)); // 24
	printf("union: %zu\n", sizeof(f));  // 16
	return 0;
}

Contrary to what older textbooks state, the size of the unsigned int data type is usually 4 bytes on modern 64-bit computers. A quick check can be done with:

	
printf("%zu", sizeof(unsigned));

Nota Bene: The sizeof operator applied to a structure or union variable does not always return the exact number of bytes occupied by its members. The result may be larger than the sum of the member sizes (or, for a union, larger than its largest member). These extra bytes are inserted by the compiler for alignment purposes and are generally known as "padding bytes."

Assigning and Accessing Values of Members

As with structure members, assigning and accessing values of the members of a union variable is done using the dot (.) operator. However, only the member that was assigned most recently should be read. Below, the most recently assigned member is f.imdb. Reading f.runtime, which was assigned earlier, therefore does not give 98: the assignment to f.imdb overwrote the shared storage, so the bytes of the float are reinterpreted as an unsigned value.

	
f.runtime = 98;
f.imdb = 7.7f;           // overwrites the storage shared with runtime
printf("%u", f.runtime); // not 98

Union Initialization

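A union variable initialized with a plain brace list can take only one value, which is assigned to the first member. If two or more values are supplied, the compiler issues an "excess elements" warning. Since C99, a designated initializer such as {.runtime = 98} can initialize a member other than the first.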


union film {
	char title[16];
	unsigned runtime;
	float imdb;
};

union film f = {"Rocketship X-M"};