C

he doesn't get respect because he doesn't have class

25 Sep 2023

I spent some time adding type annotations to some Python code. Shoutout to all the people at Logilab, a really awesome team doing really awesome things. All those type annotations made me more and more interested in static typing... and in C programming.

Target audience: people who already have some programming experience. People who may have written some C programs in the past and who want to refresh their memory.

Context: I'm working on:

Linux 5.13.0-52-generic #59-Ubuntu SMP Wed Jun 15 20:17:13 UTC 2022 x86_64
And I'm using GCC (Ubuntu 11.2.0-7ubuntu2) 11.2.0

Music please

The setup:

To write C programs, we need:

An operating system that respects your freedom and privacy.
A C compiler such as gcc or clang.
A text editor that respects your freedom and privacy.
(Optional) a Bash script to generate the project boilerplate.
(Optional) a Makefile to format, build, install and run the program quickly.

Types:

A variable is a named location in the computer's memory where data can be stored and accessed. Its type defines the size and format of that data, as well as the operations that can be performed on it. Let's talk about the common types in C programming.

Primitive types:

Primitive data types, also called "basic" or "fundamental" data types, are the most basic data types, used to represent simple values such as numbers and characters. The primitive data types are...

Integer types

char: represents a single character (always 1 byte, usually 8 bits)
short: represents a small integer
int: represents an integer
long: represents a long integer
long long: represents a very long integer
unsigned short: represents a small non-negative integer
unsigned int: represents a non-negative integer
unsigned long: represents a long non-negative integer
unsigned long long: represents a very long non-negative integer

This list is not exhaustive: there are also fixed-width integer types and minimum-width integer types. We won't talk about them here; please browse this link for reference.

To find the range of a particular type, we can look for the limits.h file online. Or, we can also look for the same file inside our computer:

lim  (trunk *%) >> find /usr -type f -name "limits.h"
/usr/src/linux-headers-5.13.0-51/include/linux/limits.h
/usr/src/linux-headers-5.13.0-51/include/uapi/linux/limits.h
/usr/src/linux-headers-5.13.0-51/include/vdso/limits.h
/usr/src/linux-headers-5.13.0-52/include/linux/limits.h
/usr/src/linux-headers-5.13.0-52/include/uapi/linux/limits.h
/usr/src/linux-headers-5.13.0-52/include/vdso/limits.h
/usr/include/linux/limits.h
/usr/include/c++/11/tr1/limits.h
/usr/include/limits.h
/usr/lib/gcc/x86_64-linux-gnu/11/include/limits.h
/usr/lib/x86_64-linux-gnu/perl5/5.32/Tk/pTk/compat/limits.h
/usr/lib/llvm-13/lib/clang/13.0.0/include/limits.h
lim  (trunk *%) >>

If we open, let's say, the file /usr/include/limits.h, we can see interesting information like:

#   define CHAR_MAX     UCHAR_MAX
#  else
#   define CHAR_MIN     SCHAR_MIN
#   define CHAR_MAX     SCHAR_MAX
#  endif

/* Minimum and maximum values a `signed short int' can hold.  */
#  define SHRT_MIN      (-32768)
#  define SHRT_MAX      32767

/* Maximum value an `unsigned short int' can hold.  (Minimum is 0.)  */
#  define USHRT_MAX     65535

/* Minimum and maximum values a `signed int' can hold.  */
#  define INT_MIN       (-INT_MAX - 1)
#  define INT_MAX       2147483647

/* Maximum value an `unsigned int' can hold.  (Minimum is 0.)  */
#  define UINT_MAX      4294967295U

/* Minimum and maximum values a `signed long int' can hold.  */
#  if __WORDSIZE == 64
#   define LONG_MAX     9223372036854775807L
#  else
#   define LONG_MAX     2147483647L

Floating-point types:

Used to represent real numbers with decimal points. More information here

float: represents a single-precision floating-point number
double: represents a double-precision floating-point number
long double: represents an extended-precision floating-point number

Here is an example showing how to use float values:

#include <stdio.h>
#include <stdlib.h>

int main() {
  char input[100];
  float radius, area;

  printf("Enter the radius of the circle: ");
  if(fgets(input, sizeof input, stdin) != NULL) {
    radius = strtof(input, NULL);
    area = 3.14 * radius * radius;
    printf("Area of the circle: %.2f\n", area);
  }
  exit(EXIT_SUCCESS);
}

I have found the file containing all the constant definitions for float values, but I am not (yet) able to fully understand what's going on in this header file:

lim  (trunk *%) >> find /usr -type f -name "float.h"
/usr/include/tcl8.6/tcl-private/compat/float.h
/usr/include/c++/11/tr1/float.h
/usr/lib/gcc/x86_64-linux-gnu/11/include/float.h
/usr/lib/llvm-13/lib/clang/13.0.0/include/float.h
lim  (trunk *%) >>

Here is a snippet of things that may be of interest for anyone looking for the ranges:

#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
/* Maximum finite positive value with MANT_DIG digits in the
   significand taking their maximum value.  */
#undef FLT_NORM_MAX
#undef DBL_NORM_MAX
#undef LDBL_NORM_MAX
#define FLT_NORM_MAX    __FLT_NORM_MAX__
#define DBL_NORM_MAX    __DBL_NORM_MAX__
#define LDBL_NORM_MAX   __LDBL_NORM_MAX__

/* Whether each type matches an IEC 60559 format (1 for format, 2 for
   format and operations).  */
#undef FLT_IS_IEC_60559
#undef DBL_IS_IEC_60559
#undef LDBL_IS_IEC_60559
#define FLT_IS_IEC_60559        __FLT_IS_IEC_60559__
#define DBL_IS_IEC_60559        __DBL_IS_IEC_60559__
#define LDBL_IS_IEC_60559       __LDBL_IS_IEC_60559__

Boolean type:

The data type that can have two possible values: true or false. Include the "stdbool.h" header file in your program to use this data type. Example:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>

bool strtobool(const char* str) {
  if (strcmp(str, "true") == 0 || strcmp(str, "1") == 0) {
    return true;
  } else if (strcmp(str, "false") == 0 || strcmp(str, "0") == 0) {
    return false;
  } else {
    fprintf(stderr, "Invalid input: %s\n", str);
    exit(EXIT_FAILURE);
  }
}

int main() {
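  char name[100];
  char answer[16];  // room for "false", the newline and the terminator
  bool is_learning;

  printf("Enter your name: ");
  if(fgets(name, sizeof name, stdin) != NULL) {
    name[strcspn(name, "\n")] = '\0';  // drop the newline kept by fgets
    printf("Your name is %s\n", name);
  }

  printf("Are you learning C programming: ");
  if(fgets(answer, sizeof answer, stdin) != NULL) {
    answer[strcspn(answer, "\n")] = '\0';  // otherwise "1\n" never equals "1"
    is_learning = strtobool(answer);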
    if (is_learning)
      printf("Very good, keep learning.\n");
    else
      printf("You should learn C programming.\n");
  }
  exit(EXIT_SUCCESS);
}

Void type:

Void is a keyword to use as a placeholder where you would put a data type, to represent "no data". The void data type represents the absence of type. Void types are used in the following situations:

to state that the specific type of the variable is not known yet (void pointers).
to state that the function returns no result (void functions).
to state that the function has no parameters.

IMPORTANT: We can't directly dereference void pointers because the compiler doesn't know the size or type of data being pointed to. We need to explicitly cast them to the correct type before dereferencing.

Here's an example to illustrate the usage of void type:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

// greet is a function that
// returns no result
// and takes an unknown/unspecified list of arguments (before C23)
void greet(){
  printf("Hello world\n");
}

int main(void){
  // notice how the compiler lets us pass all the arguments we want
  greet("foo", 3.14, true, 1);
  exit(EXIT_SUCCESS);
}

A function declared with the void return type can't return a value: a bare return; is fine, but return followed by a value gets us a diagnostic from the compiler:

cscripts  >> gcc main.c && ./a.out 
main.c: In function ‘greet’:
main.c:57:10: warning: ‘return’ with a value, in function returning void
   57 |   return 5432;
      |          ^~~~
main.c:56:6: note: declared here
   56 | void greet(void){
      |      ^~~~~~

Another example:

#include <stdio.h>
#include <stdlib.h>

// greet is a function that
// returns an int
// and takes no arguments
int greet(void){
  return 5432;
}

int main(void){
  // Because greet returns a result, I can retrieve and print it
  printf("Greet returned the value: %d\n", greet());
  exit(EXIT_SUCCESS);
}

If we try to pass an argument to the greet function, we will receive an error message:

cscripts  >> gcc main.c && ./a.out 
main.c: In function ‘main’:
main.c:57:44: error: too many arguments to function ‘greet’
   57 |   printf("Greet returned the value: %d\n", greet("foo"));
      |                                            ^~~~~
main.c:52:5: note: declared here
   52 | int greet(void){
      |     ^~~~~

Last example:

#include <stdio.h>
#include <stdlib.h>

int main(void){
  int age = 5432;
  char* name = "Nsukami_";

  // foo is a pointer that points to a not yet known data type
  void* foo;

  // let's point to an integer
  foo = &age;
  // explicit cast to the correct type before dereferencing
  printf("age is %d\n", *((int*)foo));

  // let's point to something else, for ex, a string
  foo = &name;
  // explicit cast to the correct type before dereferencing
  printf("Name is %s\n", *((char**)foo));

  exit(EXIT_SUCCESS);
}

Composite types

Any data type which can be constructed using the language's primitive data types and other composite types.

Structures:

A struct (short for structure) is a user-defined data type that allows us to group together related data items of different types. Those data items are known as members or fields.

#include <stdio.h>
#include <stdlib.h>

// An address is something with 2 attributes
// a city street AND a house number
typedef struct address {
  char citystreet[100];
  int housenumber;
} Address;

int main(void) {
  Address my_address = {
    "Planet 42, Rue de la Liberté",
    5432,
  };
  printf("I live in %s. My house number is: %d\n", my_address.citystreet, my_address.housenumber);

  // how to update one of the fields
  my_address.housenumber = 443;
  printf("I live in %s. My house number is now: %d\n", my_address.citystreet, my_address.housenumber);
  exit(EXIT_SUCCESS);
}

Enum data type:

An enumeration is a special kind of data type defined by the user. An enumeration consists of a set of constant integers that are named by the user.

#include <stdio.h>
#include <stdlib.h>

typedef enum Level {
  LOW = 1,
  MEDIUM,
  HIGH
} Level;

int main() {
  Level level = MEDIUM;

  switch (level) {
    case LOW:
      printf("Low level\n");
      break;
    case MEDIUM:
      printf("Medium level\n");
      break;
    case HIGH:
      printf("High level\n");
      break;
    default:
      printf("Unknown level value\n");
  }
  exit(EXIT_SUCCESS);
}

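We said earlier that an enumeration is a special kind of data type because its values are nothing more than named integer constants: in the example above, LOW is 1, MEDIUM is 2 and HIGH is 3.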

Unions:

A union is a user-defined data type that allows different data types to be stored in the same memory location. It enables us to create a variable that can hold different types of data, but only one type at a time.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Struct that combines an enum and a union
typedef struct {
  // The type of our object can be INT, FLOAT or STRING
  enum {
    INT,
    FLOAT,
    STRING
  } type;

  // all 3 values share the same memory location, only one can be set
  // value can be a int OR a float OR a string
  union {
    int intValue;
    float floatValue;
    char stringValue[20];
  } value;
} Object;

// Function to print the object based on its type
void print(Object o) {
  // check the type of our object 
  switch(o.type) {
    case INT:
      // then access the correct value
      printf("Integer: %d\n", o.value.intValue);
      break;
    case FLOAT:
      printf("Float: %.2f\n", o.value.floatValue);
      break;
    case STRING:
      printf("String: %s\n", o.value.stringValue);
      break;
  }
}

int main() {
  // Creating objects of different types
  Object o1, o2, o3;

  o1.type = INT;
  o1.value.intValue = 10;

  o2.type = FLOAT;
  o2.value.floatValue = 3.14;

  o3.type = STRING;
  strcpy(o3.value.stringValue, "Hello");

  // Printing the data objects
  print(o1);
  print(o2);
  print(o3);

  exit(EXIT_SUCCESS);
}

Arrays:

An array is a non-empty sequence of items of the same type, stored in contiguous memory locations. The number of items (the size) never changes during the array's lifetime. Example:

#include <stdio.h>
#include <stdlib.h>

int main() {
    float numbers[] = {2.5, 3.8, 4.2, 1.9, 5.6};
    int size = sizeof(numbers) / sizeof(numbers[0]);
    float sum = 0.0, average;

    for (int i = 0; i < size; i++) {
        sum += numbers[i];
    }

    average = sum / size;
    printf("Average: %.2f\n", average);

    exit(EXIT_SUCCESS);
}

Strings:

A string is just a sequence of characters stored inside an array and ending with the null character: '\0'. C has no dedicated string type like the one we have in Python.

#include <stdio.h>
#include <stdlib.h>

int main() {
  char name[100];
  printf("Enter your name: ");

  if(fgets(name, sizeof name, stdin) != NULL) {
    printf("Your name is %s\n", name);
  }
  exit(EXIT_SUCCESS);
}

To find out which functions are available for string manipulation, Python developers will type help(str):

Help on class str in module builtins:

class str(object)
 |  str(object='') -> str
 |  str(bytes_or_buffer[, encoding[, errors]]) -> str
 |
 |  Create a new string object from the given object. If encoding or
 |  errors is specified, then the object must expose a data buffer
 |  that will be decoded using the given encoding and error handler.
 |  Otherwise, returns the result of object.__str__() (if defined)
 |  or repr(object).
 |  encoding defaults to sys.getdefaultencoding().
 |  errors defaults to 'strict'.
 |
 |  Methods defined here:
 |
 |  __add__(self, value, /)
 |      Return self+value.
 |
 |  __contains__(self, key, /)
 |      Return bool(key in self).
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __format__(self, format_spec, /)
:

C developers will look inside the <string.h> header file.

nskm2  >> find /usr/include -type f -name "string.h"
/usr/include/linux/string.h
/usr/include/tcl8.6/tcl-private/compat/string.h
/usr/include/string.h
nskm2  >>

They can also read the man pages, man string:

STRING(3)                                  Linux Programmer's Manual                                 STRING(3)

NAME
       stpcpy,  strcasecmp, strcat, strchr, strcmp, strcoll, strcpy, strcspn, strdup, strfry, strlen, strncat,
       strncmp, strncpy, strncasecmp, strpbrk, strrchr, strsep, strspn, strstr, strtok, strxfrm, index, rindex
       - string operations

SYNOPSIS
       #include <strings.h>

       int strcasecmp(const char *s1, const char *s2);
              Compare the strings s1 and s2 ignoring case.

       int strncasecmp(const char *s1, const char *s2, size_t n);
              Compare the first n bytes of the strings s1 and s2 ignoring case.

       char *index(const char *s, int c);
              Return a pointer to the first occurrence of the character c in the string s.

       char *rindex(const char *s, int c);
              Return a pointer to the last occurrence of the character c in the string s.

       #include <string.h>

       char *stpcpy(char *dest, const char *src);
              Copy a string from src to dest, returning a pointer to the end of the resulting string at dest.
 Manual page string(3) line 1 (press h for help or q to quit)

Constants:

A constant is a fixed value that cannot, and should not, be changed during the execution of a program.

If we have a fixed value that is used in calculations, we can define it as a constant to avoid hard-coding the value multiple times.

// let's define PI as a constant
// (a double, because an int would silently truncate 3.14159 to 3)
const double PI = 3.14159;
int radius = 5;
// let's reuse our constant
double area = PI * radius * radius;

Trying to modify our constant will give us an error message:

main.c: In function ‘main’:
main.c:129:6: error: assignment of read-only variable ‘PI’
  129 |   PI = 4.5;
      |      ^

Constants are also a way to avoid magic numbers. Instead of using arbitrary numbers directly in your code, you can define them as constants with meaningful names to improve code readability and maintainability.

const int LENGTH = 16;  // room for the digits, the newline and the terminator
const int MIN_AGE = 18;
const int MAX_AGE = 50;

char input[LENGTH];
int age;

printf("Enter your age: ");
if(fgets(input, sizeof input, stdin) != NULL) {
  age = strtol(input, NULL, 10);
  if (age < MIN_AGE || age > MAX_AGE) {
    printf("%d is an invalid age! Please enter an age between %d and %d.\n", age, MIN_AGE, MAX_AGE);
  }
}

When declaring an array, we can use a constant to specify its size, making it easier to modify if needed.

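// In this example, `ARRAY_SIZE` is declared as a constant to specify the size of an array. By using a
// constant, you can easily change the size of the array by modifying the constant's value in a single place.
// Note: in C, a const int is not a compile-time constant, so arr is a variable-length array (C99),
// which only works inside a function.
const int ARRAY_SIZE = 10;
int arr[ARRAY_SIZE];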

We can also declare function parameters as const to ensure that the function does not modify the parameter:

// This helps prevent accidental modifications and indicates that the function only reads the array.
void print(const int arr[], int size) {
    for (int i = 0; i < size; i++) {
        printf("%d ", arr[i]);
    }
}

Conclusion:

I really hope you've learned something. I did \o/ In the next episode, I will try to write about header files or pointers or something else depending on the current moon, the weather and the stars alignment. Also, thanks to Orbifx for taking the time to answer all my questions.

On a completely different note:
