How to create processes

The fork() and exec() system calls.

We've seen processes and said a lot about them, but we haven't talked about how they are created. Let's see how to create a process in C, in Go, and in Rust. Disclaimer: I'm not doing a comparison to find out which one is better than the other. I just want to explore how the same task can be done using different approaches.

Fork and Exec:

UNIX processes are created using the fork() system call, the primary method of process creation on Unix-like operating systems. The parent process calls fork() to "divide" itself, to "fork" into two nearly identical processes. The child is a copy of the parent, but it gets its own PID, and fork() returns a different value on each side: 0 in the child, and the child's PID in the parent (or -1 in the parent if the fork failed).
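To make the two return values concrete, here is a minimal sketch. The helper name fork_and_report is made up for this illustration; it forks once, lets each side print what fork() returned, and has the parent collect the child's exit code.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* fork_and_report is a hypothetical helper, not a standard function.
 * The child exits with exit_code; the parent waits for it and returns
 * the exit code it observed, or -1 on error. */
int fork_and_report(int exit_code) {
  assert(exit_code >= 0 && exit_code <= 255);

  // flush pending output so the child does not inherit a filled buffer
  fflush(stdout);

  pid_t pid = fork();
  if (pid < 0) {
    perror("fork");
    return -1;
  }

  if (pid == 0) {
    // child side: fork() returned 0
    printf("child:  fork() returned 0, my pid is %d\n", (int)getpid());
    fflush(stdout);
    _exit(exit_code);
  }

  // parent side: fork() returned the pid of the new child
  printf("parent: fork() returned %d, my pid is %d\n", (int)pid, (int)getpid());

  int status;
  if (waitpid(pid, &status, 0) < 0) {
    perror("waitpid");
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_report(7) from main() should print one line from each process (in no guaranteed order) and return 7 to the caller.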

The exec() family of functions is almost always used together with fork(). A call to one of them replaces the current [process image]({{< ref on-processes >}}) with the program passed to it: the PID stays the same, but the code, data, heap and stack now belong to the new program. On success, exec() never returns.
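The "never returns on success" part is easy to miss, so here is a small sketch of it. run_and_wait is a hypothetical helper name, and the exit code 127 for a failed exec is only a convention borrowed from shells, not something exec() imposes.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* run_and_wait is a hypothetical helper: it runs argv[0] in a child
 * process and returns the child's exit code, or -1 on error. */
int run_and_wait(char *const argv[]) {
  assert(argv != NULL && argv[0] != NULL);

  pid_t pid = fork();
  if (pid < 0) {
    perror("fork");
    return -1;
  }

  if (pid == 0) {
    execvp(argv[0], argv);
    // Only reached if exec failed. On success, the child is now
    // running the new program and this code no longer exists in it.
    perror(argv[0]);
    _exit(127); // assumption: shell-style "command not found" code
  }

  int status;
  if (waitpid(pid, &status, 0) < 0) {
    perror("waitpid");
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With an argument vector such as {"sh", "-c", "exit 42", NULL}, the caller gets 42 back: the exit code comes from the program that replaced the child, not from the lines after execvp().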

C:

In C, there is nothing special to do, contrary to Go and Rust. You simply call fork(), then exec(), simple and direct. The following is an example written in C:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <string.h>

void fork_and_execute();

int main() {
  fork_and_execute();
  return 0;
}

void fork_and_execute() {
  pid_t pid;
  int status;
  pid = fork();

  // if pid equals 0,
  if(pid == 0){
    // we're dealing with the child process
    printf("I am the child process, my pid is %d\n", getpid());

    // the program we want to execute
    // (the argument list must end with a NULL pointer)
    char *cmd[5] = {"ls", "-a", "-l", "-h", NULL};

    // execvp will search for ls on $PATH
    if (execvp(*cmd, cmd) < 0){
      // catch any error that may occur during execution
      printf("*** ERROR: exec failed\n");
      perror(*cmd);
      exit(EXIT_FAILURE);
    }
  }

  // returned pid is -1, fork operation failed
  if(pid < 0){
    printf("*** ERROR: forking child process failed\n");
    perror("fork");
    exit(EXIT_FAILURE);
  }

  // The waitpid system call blocks the parent process
  // until this particular child process has ended.
  if (waitpid(pid, &status, 0) < 0) {
    perror("waitpid");
    exit(EXIT_FAILURE);
  }
}

Go:

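In Go, it is recommended to use the os/exec package. It runs external commands and wraps the os.StartProcess function for you. The example below only borrows exec.LookPath from it to locate the binary, then calls syscall.Exec, which replaces the current Go process with ls instead of forking a child. Here is the previous example, this time written in Go: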

package main

import (
    "os"
    "os/exec"
    "syscall"
)

func main() {
    // check if ls command exists in the PATH
    binary, lookErr := exec.LookPath("ls")
    if lookErr != nil {
        panic(lookErr)
    }

    // the program we want to execute
    args := []string{"ls", "-a", "-l", "-h"}

    // replace the current process with our program
    // (no fork here: on success, syscall.Exec never returns)
    execErr := syscall.Exec(binary, args, os.Environ())

    // catch error if any
    if execErr != nil {
        panic(execErr)
    }
}

Rust:

In Rust, we have std::process::Command. It is a type that acts as a process builder and provides fine-grained control over how the new process should be spawned. If we browse the source code, we can read, beginning at line 521:

Constructs a new Command for launching the program at path program, with the following default configuration:

  • No arguments to the program
  • Inherit the current process's environment
  • Inherit the current process's working directory
  • Inherit stdin/stdout/stderr for spawn or status, but create pipes for output

Basically, it behaves a little bit like a call to fork() followed by a call to exec(). Let's rewrite the previous example in Rust:

use std::process::Command;

fn main() {
    let output = Command::new("ls")
                     .arg("-a")
                     .arg("-l")
                     .arg("-h")
                     .output()
                     .expect("ls command failed to start");

    println!("stdout: {}", String::from_utf8_lossy(&output.stdout));
}

The output function executes the command as a child process, waits for it to finish and collects all of its output. By default, stdout and stderr are captured (and used to provide the resulting output), while stdin is not inherited from the parent.
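To get a feel for what "capturing" means at the system call level, here is a rough C sketch of the same idea, limited to stdout: create a pipe, fork, point the child's stdout at the pipe, exec, and read from the other end in the parent. This is an illustration under my own assumptions, not how Rust's standard library is actually implemented, and capture_output is a made-up name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* capture_output is a hypothetical helper: it runs argv[0] in a child
 * and copies up to cap - 1 bytes of its stdout into buf, which is
 * always NUL-terminated. Returns the number of bytes stored, or -1. */
ssize_t capture_output(char *const argv[], char *buf, size_t cap) {
  assert(argv != NULL && argv[0] != NULL && buf != NULL && cap > 0);

  int fds[2]; // fds[0] is the read end, fds[1] is the write end
  if (pipe(fds) < 0) {
    return -1;
  }

  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  if (pid == 0) {
    // child: make stdout refer to the write end of the pipe
    close(fds[0]);
    dup2(fds[1], STDOUT_FILENO);
    close(fds[1]);
    execvp(argv[0], argv);
    _exit(127); // only reached if exec failed
  }

  // parent: close the write end, otherwise read() never sees EOF
  close(fds[1]);

  size_t total = 0;
  ssize_t n;
  while (total < cap - 1 &&
         (n = read(fds[0], buf + total, cap - 1 - total)) > 0) {
    total += (size_t)n;
  }
  buf[total] = '\0';
  close(fds[0]);

  int status;
  waitpid(pid, &status, 0); // reap the child so it does not stay a zombie
  return (ssize_t)total;
}
```

Running it with {"echo", "hello", NULL} and a 64-byte buffer should leave "hello\n" in the buffer. Output beyond the buffer size is simply dropped in this sketch, whereas Rust's output() collects everything.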

The expect function is used for quick and dirty error handling: it panics with the given message if the command could not be started.


Strace, trace system calls and signals:

In strace's manual page, under the -f option, we can read:

Trace child processes as they are created by currently traced processes as a result of the fork(2), vfork(2) and clone(2) system calls.

If everything went well during the execution of our programs, strace's output should show lots of system calls, the most important ones being a call to clone() followed by a call to execve():

...
4714  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fac5d28da90) = 4715
...
7564  execve("/bin/ls", ["ls", "-a", "-l", "-h"], [/* 74 vars */] <unfinished ...>
...

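fork() is the original UNIX system call for process creation. It only creates new processes, not threads, and it is portable across Unix-like systems. On Linux, the glibc fork() wrapper is itself implemented on top of clone(), which is why strace reports clone() rather than fork().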

clone() is a newer, Linux-specific system call. It is versatile: depending on the flags passed to it, it can create a UNIX process, a POSIX thread, something in between, or something completely different. I invite you to Read The Fabulous Manual about clone().
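For the curious, here is a minimal sketch of creating a plain process with the glibc clone() wrapper, assuming Linux and glibc (it will not build elsewhere). Passing only SIGCHLD as flags gives roughly fork-like behavior: nothing is shared with the parent, and the parent is notified with SIGCHLD when the child ends. The names clone_and_exec, child_main and CHILD_STACK_SIZE are made up for this example.

```c
#define _GNU_SOURCE // must come before any include to expose clone()
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (1024 * 1024) // arbitrary size for this sketch

/* Entry point of the child: replace it with the requested program. */
static int child_main(void *arg) {
  char *const *argv = arg;
  execvp(argv[0], argv);
  perror("execvp"); // only reached if exec failed
  return 127;       // becomes the child's exit code
}

/* clone_and_exec is a hypothetical helper: it runs argv[0] in a child
 * created with clone() and returns its exit code, or -1 on error. */
int clone_and_exec(char *const argv[]) {
  assert(argv != NULL && argv[0] != NULL);

  // unlike fork(), clone() needs an explicit stack for the child
  char *stack = malloc(CHILD_STACK_SIZE);
  if (stack == NULL) {
    return -1;
  }

  // the stack grows downward on most architectures, so pass its top
  pid_t pid = clone(child_main, stack + CHILD_STACK_SIZE, SIGCHLD,
                    (void *)argv);
  if (pid < 0) {
    perror("clone");
    free(stack);
    return -1;
  }

  int status;
  pid_t waited = waitpid(pid, &status, 0);
  free(stack);
  if (waited < 0) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Adding flags such as CLONE_VM or CLONE_FILES is what moves the result from "process" toward "thread", since the child then shares memory or file descriptors with its parent.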

I don't really know if there is something else you can read to learn more on this topic: