Makefiles, Best Practices

Nov 29, 2018

Makefiles are used to build projects in many languages, most commonly C and C++. If you develop or test software, you will very likely encounter them.

This post will try to address some common errors in Makefiles, as well as good practices and cross-compiling support.

Prerequisites: a good understanding of Makefiles, the UNIX directory hierarchy and the compilation process.

Make versions

POSIX Make standard syntax¹ is often extended by the different implementations. In this post we will use GNU Make² (gmake) and its extensions.

Each Make dialect first searches for a dialect-specific file when executed and falls back to a general Makefile if none is found; for GNU Make, this file is GNUmakefile. When you rely on a dialect’s extensions rather than on the POSIX interface, consider naming the Makefile accordingly; this way it’s clear which implementation is needed.

On most Linux systems and on macOS, make is GNU Make; on systems where another implementation is the default, such as the BSDs, GNU Make is usually installed as gmake. You can check the version of make on your system by running:

$ make --version
GNU Make 4.2.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

For simplicity, from now on we will refer to GNUmakefile and gmake as Makefile and make, respectively.

Variable setting

In this post we will mainly use three ways (of the five available³) to set a variable in a Makefile. Here is a recap from StackOverflow⁴:

Lazy Set: Normal setting of a variable — values within it are recursively expanded when the variable is used, not when it’s declared

VARIABLE = value

Immediate Set: Setting of a variable with simple expansion of the values inside — values within it are expanded at declaration time.

VARIABLE := value

Set If Absent: Setting of a variable only if it doesn’t have a value

VARIABLE ?= value
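The difference between the three operators is easiest to see side by side. The following is a minimal sketch; the variable names NAME, LAZY and EAGER and the no-op all target are hypothetical, and it assumes none of them is set on the make command line.

```makefile
NAME = first

# Lazy set: the reference to NAME is stored unexpanded.
LAZY = $(NAME)

# Immediate set: NAME is expanded right here, so EAGER holds "first".
EAGER := $(NAME)

NAME = second

# Set if absent: NAME already has a value, so this line has no effect.
NAME ?= third

# Printed while the Makefile is read: LAZY=second EAGER=first NAME=second
$(info LAZY=$(LAZY) EAGER=$(EAGER) NAME=$(NAME))

# Hypothetical no-op default target so that make has something to do.
all: ;
```

LAZY follows the later change to NAME because it is expanded only when used, while EAGER keeps the value NAME had at declaration time.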

Compiler

CC ?= gcc
LD ?= gcc

While it is usually safe to assume that sensible values have been set for CC and LD, it does no harm to set them if and only if they are not already set in the environment, using the operator ?=. Be aware that GNU Make predefines CC (as cc) and LD (as ld), so ?= leaves those built-in defaults in place when the environment sets nothing; the gcc fallback takes effect only when built-in variables are disabled (make -R).

Using the assignment operator = will instead override the CC and LD values from the environment; the Makefile then dictates the compiler, and it cannot be changed without editing the Makefile or passing these variables on the command line. This leads to two problems:

The user has set CC=clang in the environment but gcc will be used anyway, even if it isn’t installed.
A cross-compile environment has set CC to a link to the actual target architecture compiler, for example arm-pc-linux-cc, but the host’s gcc will be used.

If the user has noticed these problems (e.g. the compilation fails because gcc is not installed), they can append CC=clang to the make call:

$ make CC=clang

This solution works as intended (clang compiles the sources) whether the Makefile uses the = or the ?= operator, but it adds work for the user: they must read the Makefile to find which variables to pass on the command line. It’s also error-prone, because it’s easy to miss one variable assignment in a large project, whereas it is safe to assume that the environment already contains correct values.

For these reasons, this solution is considered sub-optimal, especially for package maintainers, and its usage is therefore discouraged.
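Since GNU Make ships a built-in value for CC, one way to choose a different default without overriding the user is to check where the current value came from. This is a minimal sketch, not the only approach; it relies on the $(origin) function, takes gcc as the fallback as in the snippet above, and show-cc is a hypothetical helper target for inspection.

```makefile
# $(origin CC) is "default" only while CC still has GNU Make's built-in
# value, i.e. it was set neither in the environment nor on the command line.
ifeq ($(origin CC),default)
CC = gcc
endif

# Hypothetical helper target: print the compiler that will be used.
# Recipe lines must start with a tab.
show-cc:
	@echo "CC is $(CC) (origin: $(origin CC))"

.PHONY: show-cc
```

With CC=clang exported in the environment, make show-cc reports clang with origin environment; with nothing set, it reports gcc with origin file.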

Compiler flags

The make utility also uses variables defined by implicit rules⁵; among these variables, some define extra build flags:

CFLAGS: flags for the C compiler
CXXFLAGS: flags for the C++ compiler
CPPFLAGS⁶: flags for the C preprocessor, used by the C/C++ and Fortran compilers

Note: there is a variable named CCFLAGS that some projects use; it defines extra flags for both the C and C++ compilers. This variable is not defined by the implicit rules; please avoid it if you can.

Note 2: build systems usually follow the make implicit rules for both variable naming and meaning. In other words, defining CFLAGS means defining extra flags for the C compiler, whatever build system you’re using.
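To make the role of these variables concrete, here is a sketch of explicit rules that mirror how GNU Make’s built-in rules consume them. The program name foo and its source foo.c are hypothetical; LDFLAGS and LDLIBS are the link-time counterparts used by the built-in link rule.

```makefile
# Compile step: preprocessor flags and C compiler flags are both passed.
# Recipe lines must start with a tab.
%.o: %.c
	$(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $<

# Link step: linker flags first, libraries last.
foo: foo.o
	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
```

Writing these rules by hand is rarely necessary; the point is that any flags placed in CFLAGS or CPPFLAGS reach the compiler through them.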

Usually, we add compiler options specific to the application we’re writing, such as the language revision (do we want to use C89 or C99?). Then the user adds their own CFLAGS/CXXFLAGS to enable debugging or optimizations; it is important to honor these user-defined flags.

We would be tempted to do:

CFLAGS = -std=c99

But this would discard the environment CFLAGS, which might contain user-defined values. You should instead do:

CFLAGS := ${CFLAGS} -std=c99

Note: The Immediate set (:=) is used because the Lazy set (=) would make CFLAGS reference itself, which make rejects as a recursive loop.

Or, if you have a long CFLAGS:

CFLAGS += -std=c99

In this last example, we append our values to the environment CFLAGS; if it isn’t defined, it expands to no text. The resulting CFLAGS will still be a recursively expanded variable⁷.

We can optimize a little by converting CFLAGS from a recursive variable to a simple one.

CFLAGS := ${CFLAGS}
CFLAGS += -std=c99
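The change of flavor can be observed with GNU Make’s $(flavor) function. This is a small sketch, assuming CFLAGS is not passed on the make command line (command-line values would take precedence over these assignments); the all target is a hypothetical placeholder.

```makefile
# Append project flags without discarding CFLAGS from the environment.
CFLAGS += -std=c99
# Prints "before: recursive"
$(info before: $(flavor CFLAGS))

# Re-assign with := to expand once and make the variable simple.
CFLAGS := $(CFLAGS)
# Prints "after: simple"
$(info after: $(flavor CFLAGS))

# Hypothetical no-op default target so that make has something to do.
all: ;
```

Note that this sketch converts after appending, while the snippet above converts first; either order ends with a simple variable, since += preserves the flavor the variable already has.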

Libraries

In order to use libraries in the program, compiler flags are needed at both compile and link time.

You can add default values when including libraries, for example the default headers location /usr/include/; if you follow this approach, put the value in a variable that can be overridden from the environment (?= set).

Suppose we want to compile and link our program against OpenSSL, a broadly used library for cryptography and the TLS/SSL protocols; we would intuitively add -I/usr/include/openssl to our CFLAGS/CXXFLAGS. This might work on most Linux systems, but a macOS user could have the OpenSSL headers in /usr/local/include/openssl, which breaks compilation; the same goes for cross-compilation.

What should be done instead is:

OPENSSL_INCLUDE ?= -I/usr/include/openssl
OPENSSL_LIBS ?= -lssl -lcrypto

CFLAGS ?= -O2 -pipe
CFLAGS += $(OPENSSL_INCLUDE)
LIBS := $(OPENSSL_LIBS)

While this approach results in a successful compilation, since values can be overridden when needed, it is cumbersome and error-prone. A better way to include external libraries is to use pkg-config.

pkg-config

pkg-config⁸ is a command-line tool that provides the correct compiler options when including libraries; it is widely used in Makefiles but also in various build systems such as CMake⁹ and Meson¹⁰.

Let’s rewrite the previous snippet using pkg-config:

PKG_CONFIG ?= pkg-config

CFLAGS ?= -O2 -pipe
CFLAGS += -std=c99
CFLAGS += $(shell ${PKG_CONFIG} --cflags openssl)
LIBS := $(shell ${PKG_CONFIG} --libs openssl)

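Note how the pkg-config executable can be overridden from the environment, again to support cross-compiling.

Immediate set should be used with LIBS to avoid spawning pkg-config every time the variable is evaluated. Thanks to u/dima55¹¹ for the tip. The same concern applies to CFLAGS: it is recursively expanded here, so the $(shell) call appended with += runs each time CFLAGS is expanded unless CFLAGS is first converted to a simple variable.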
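One way to make sure pkg-config runs only once per query is to capture its output in simply expanded helper variables first. This is a sketch under assumptions: openssl_cflags and openssl_libs are hypothetical internal names, and foo is a hypothetical program built from foo.c.

```makefile
PKG_CONFIG ?= pkg-config

# Run pkg-config once per variable; := stores the expanded result.
openssl_cflags := $(shell $(PKG_CONFIG) --cflags openssl)
openssl_libs := $(shell $(PKG_CONFIG) --libs openssl)

CFLAGS ?= -O2 -pipe
CFLAGS += -std=c99 $(openssl_cflags)
LIBS := $(openssl_libs)

# foo.o is produced from foo.c by the built-in rule, which uses CFLAGS.
# Recipe lines must start with a tab.
foo: foo.o
	$(CC) $(LDFLAGS) -o $@ $^ $(LIBS)
```

CFLAGS stays recursively expanded here, but what it references is already plain text, so expanding it no longer spawns a process.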

Miscellaneous

Other executables often used when compiling are ar, ranlib and as; don’t call these executables directly, but store their names in variables and use those variables instead.

AR ?= ar
RANLIB ?= ranlib
AS ?= as

From make documentation¹²:

The precise recipe is ${AS} ${ASFLAGS}.
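As an illustration of using these variables instead of the bare tool names, here is a sketch of a static library rule. The library libfoo.a and the objects foo.o and bar.o are hypothetical.

```makefile
AR ?= ar
RANLIB ?= ranlib

# Archive the objects, then generate the archive index.
# Recipe lines must start with a tab.
libfoo.a: foo.o bar.o
	$(AR) rc $@ $^
	$(RANLIB) $@
```

A cross-compiling environment can then point AR and RANLIB at the target toolchain without any change to the Makefile.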

Installation

The last part of a Makefile is the installation of the program itself and its related data. This is the trickiest part because there are a lot of data types, and each one can be installed in various locations.

PREFIX

Before talking about the different components to install, we should briefly discuss the PREFIX variable. Binaries usually go in a bin directory: on Linux, /usr/bin is used for packages managed by the package manager and /usr/local/bin for software installed by the user, while FreeBSD ports use /usr/local/bin. The same goes for the other components (data, man pages).

It is important to specify under which PREFIX the bin directory is located:

PREFIX ?= /usr/local

The default value should be /usr/local¹³.

BINDIR

The principal component of every program (excluding libraries) is the executable itself. As we stated before, the usual directory is ${PREFIX}/bin, so at first glance we could choose to install the executables directly there like this:

install:
	@cp -p foo ${PREFIX}/bin/foo

While this is correct most of the time, it limits the flexibility of the Makefile we’re writing. To give the user more choice, we should instead use a variable to specify the location of the executables: BINDIR.

BINDIR ?= ${PREFIX}/bin

install:
	@cp -p foo ${BINDIR}/foo

DATADIR

From the GNU Coding Standards¹⁴:

The directory for installing idiosyncratic read-only architecture-independent data files for this program.

The DATADIR directory, usually /usr/share, contains read-only data, from application icons to man pages and documentation.

DATADIR ?= ${PREFIX}/share

It is important to specify DATADIR because MANDIR and other variables depend on it.

It can happen that this directory doesn’t start with PREFIX; for example a cross-compiled system can have the executables in /usr/${ARCH}/bin (therefore /usr/${ARCH} as PREFIX) and have /usr/share as DATADIR.

MANDIR

The location where the man pages are installed is stored in MANDIR.

If you are using DATADIR:

MANDIR ?= ${DATADIR}/man

Otherwise:

MANDIR ?= ${PREFIX}/share/man

Both are equally flexible, but the first approach is better because a user can set DATADIR alone and MANDIR follows.
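The way MANDIR follows DATADIR can be checked with a short sketch. It assumes none of the three variables is already set in the environment; the all target is a hypothetical placeholder.

```makefile
PREFIX ?= /usr/local
DATADIR ?= $(PREFIX)/share
MANDIR ?= $(DATADIR)/man

# Print the resolved locations while the Makefile is being read.
$(info DATADIR=$(DATADIR))
$(info MANDIR=$(MANDIR))

# Hypothetical no-op default target so that make has something to do.
all: ;
```

Running make DATADIR=/usr/share reports MANDIR=/usr/share/man while PREFIX keeps its default, which matches the cross-compiling layout described earlier.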

DESTDIR

Now that we have chosen the locations of the various components, there is one last important question to ask ourselves: what is the base install directory?

We usually want to install the program under the root directory, which works with everything we’ve seen so far. But suppose the target system is mounted somewhere else, or the toolchain installs into a special staging directory that is later merged with the root; how can we target this custom destination directory?

DESTDIR is a special variable that lets us specify a destination directory. Its usage is simple: in the install section, we prefix all the variables we saw with ${DESTDIR}.

install:
	@cp -p foo ${DESTDIR}${BINDIR}/foo
	@cp -p foo.1 ${DESTDIR}${MANDIR}/man1/foo.1

There is no need to set it: either the user sets DESTDIR, or it expands to nothing and the paths are resolved from /.

Note that the following assignment to BINDIR is incorrect because DESTDIR should only be added later in the install section and not on variable assignment.

BINDIR = ${DESTDIR}${PREFIX}/bin
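Putting the installation variables together, an install section might look like the sketch below. It is one possible arrangement, not a canonical one: the program foo and its man page foo.1 are hypothetical and assumed to be already built, and the mkdir -p step is an addition so that staging into an empty DESTDIR works.

```makefile
PREFIX ?= /usr/local
BINDIR ?= $(PREFIX)/bin
DATADIR ?= $(PREFIX)/share
MANDIR ?= $(DATADIR)/man

# DESTDIR appears only here, never in the variable assignments above.
# Recipe lines must start with a tab.
install:
	mkdir -p $(DESTDIR)$(BINDIR) $(DESTDIR)$(MANDIR)/man1
	cp -p foo $(DESTDIR)$(BINDIR)/foo
	cp -p foo.1 $(DESTDIR)$(MANDIR)/man1/foo.1

.PHONY: install
```

For example, make DESTDIR=/tmp/stage PREFIX=/usr install places the binary at /tmp/stage/usr/bin/foo, while the paths recorded in BINDIR and MANDIR remain those of the final system.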

Other locations

There are many other locations; we covered only the most used ones. Refer to the GNU Coding Standards¹⁴ and to the Makefiles of big projects to see other variables.

Reproducible builds

Another practice, which has seen increasing adoption in recent years, is reproducible builds¹⁵:

Reproducible builds are a set of software development practices that create an independently-verifiable path from source code to the binary code used by computers.

This practice is being adopted by Debian¹⁶ and it is worth adopting in general. Explaining how to adopt reproducible builds goes beyond the scope of this post, but you can read more on the official site¹⁵.

Conclusion

By following these rules and standards you will have a better Makefile, one that adapts to different user environments and supports cross-compiling.

danyspin97's site