Makefiles, Best Practices
Nov 29, 2018

    Makefiles are used to build projects in many languages, most often C and C++. Whenever you develop or test software, you are very likely to encounter them.

    This post will try to address some common errors in Makefiles, as well as good practices and cross-compiling support.

    Prerequisites: a good understanding of Makefiles, the UNIX directory hierarchy and the compilation process.

    Make versions

    The POSIX Make standard syntax [1] is often extended by the different implementations. In this post we will use GNU Make [2] (gmake) and its extensions.

    Each Make dialect first searches for its own specific file when executed and falls back to a general Makefile if it doesn’t find one; for GNU Make, this file is GNUmakefile. When you rely on a dialect rather than on the POSIX interface, consider naming the Makefile accordingly; this way it’s clear which implementation is needed.

    On most Linux systems and on macOS, make is GNU Make; on the BSDs, GNU Make is usually installed as gmake. You can check the version of make on your system by running:

    $ make --version
    GNU Make 4.2.1
    Built for x86_64-pc-linux-gnu
    Copyright (C) 1988-2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    

    For simplicity, from now on we will refer to GNUmakefile and gmake as Makefile and make, respectively.

    Variable setting

    In this post we will mainly use three of the available ways [3] to set a variable in a Makefile. Here is a recap from StackOverflow [4]:

    Lazy Set

    Normal setting of a variable: values within it are expanded recursively when the variable is used, not when it is declared.

    VARIABLE = value
    
    Immediate Set

    Setting of a variable with simple expansion of the values inside: values within it are expanded at declaration time.

    VARIABLE := value
    
    Set If Absent

    Setting of a variable only if it doesn’t have a value yet.

    VARIABLE ?= value
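
    The difference between the three operators is easiest to see when a referenced value changes after the assignment. Here is a minimal sketch, assuming GNU Make; NAME, LAZY, EAGER and FALLBACK are hypothetical names chosen only for this illustration:

```makefile
# Hypothetical variable names, used only to illustrate the three operators.
NAME = first

# Lazy set: NAME is looked up every time LAZY is used.
LAZY = $(NAME)
# Immediate set: NAME is expanded once, on this line.
EAGER := $(NAME)

NAME = second

# Set if absent: only applies when FALLBACK has no value yet,
# for example from the environment.
FALLBACK ?= default

# $(info ...) prints while the Makefile is being read.
$(info LAZY=$(LAZY) EAGER=$(EAGER) FALLBACK=$(FALLBACK))

.PHONY: all
all: ;
```

    With FALLBACK unset in the environment, running make on this file should print LAZY=second EAGER=first FALLBACK=default; running it as FALLBACK=custom make should keep the value from the environment instead.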
    

    Compiler

    CC ?= gcc
    LD ?= gcc
    

    While it is usually safe to assume that sensible values have been set for CC and LD, it does no harm to set them if and only if they are not already set in the environment, using the operator ?=.

    Using the assignment operator = will instead override the CC and LD values from the environment; it means that we choose the default compiler and it cannot be changed without editing the Makefile or passing these variables on the command line. This leads to two problems:

    • The user has set CC=clang in the environment but gcc will be used anyway, even if it isn’t installed.
    • A cross-compile environment has set CC to a link to the actual target-architecture compiler, for example arm-pc-linux-cc, but the host’s gcc will be used.

    If users notice these problems (e.g. the compilation fails because gcc is not installed), they can append CC=clang to the make call:

    $ make CC=clang
    

    This solution works as intended (clang compiles the sources) whether the Makefile uses the = or the ?= operator, but it puts extra work on the user, who has to read the Makefile to find out which variables to pass on the command line. It’s also error-prone: in a large project it’s easy to miss one variable assignment, whereas it is safe to assume that the environment already contains correct values.

    For these reasons, this solution is considered sub-optimal, especially for package maintainers, and its usage is discouraged.
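
    One GNU Make detail is worth keeping in mind in this section: CC and LD are predefined by make itself (as cc and ld), so a ?= assignment leaves those built-in values in place rather than applying gcc. If the project really needs its own default, one possible approach is to test where the current value comes from. The sketch below assumes GNU Make and is only an illustration, not the one correct way to do it:

```makefile
# $(origin CC) is "default" only when CC still holds make's built-in
# value, i.e. neither the environment nor the command line set it.
ifeq ($(origin CC),default)
CC = gcc
endif

# Show which compiler was selected and where the value came from.
$(info CC=$(CC) [origin: $(origin CC)])

.PHONY: all
all: ;
```

    Run as plain make, this should report gcc with origin file; with CC=clang in the environment or on the command line, the user’s choice is kept and the origin is reported as environment or command line.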

    Compiler flags

    The make utility also uses variables defined by its implicit rules [5]; among these variables, some define extra build flags:

    • CFLAGS: flags for the C compiler
    • CXXFLAGS: flags for the C++ compiler
    • CPPFLAGS [6]: flags for the C preprocessor, used by the C, C++ and Fortran compilers

    Note: there is a variable named CCFLAGS that some projects are using; it defines extra flags for both the C/C++ compilers. This variable is not defined by the implicit rules, please avoid it if you can.

    Note 2: build systems usually follow the make implicit rules for both variable naming and meaning. In other words, defining CFLAGS means defining extra flags for C compiler whatever build system you’re using.

    Usually, we add compiler options that are specific to the application we’re writing, such as the language revision (do we want to use C89 or C99?). Users then add their own CFLAGS/CXXFLAGS to enable debugging or optimizations; it is important to honor these user-defined flags.

    We would be tempted to do:

    CFLAGS = -std=c99
    

    But this would discard the CFLAGS from the environment, which might contain user-defined values. You should instead do:

    CFLAGS := ${CFLAGS} -std=c99
    

    Note: The Immediate set (:=) is used because the Lazy set (=) would result in a recursive loop.

    Or, if you have a long CFLAGS:

    CFLAGS += -std=c99
    

    In this last example, we append our values to the environment’s CFLAGS; if it isn’t defined, it expands to no text. The resulting CFLAGS will still be a recursively expanded variable [7].

    We can optimize a little by converting CFLAGS from a recursive variable to a simple one.

    CFLAGS := ${CFLAGS}
    CFLAGS += -std=c99
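
    To check what the conversion actually does, we can ask make for the flavor and origin of the variable. This is a small sketch assuming GNU Make, whose flavor and origin functions are used here; the -std=c99 flag simply stands in for the project’s own options:

```makefile
# Convert the inherited value (from the environment, or empty) to a
# simple variable, then append the project-specific flag.
CFLAGS := $(CFLAGS)
CFLAGS += -std=c99

# Report the final value together with its flavor and origin.
$(info CFLAGS=$(CFLAGS) [flavor: $(flavor CFLAGS)] [origin: $(origin CFLAGS)])

.PHONY: all
all: ;
```

    With CFLAGS=-g in the environment, the reported flavor should be simple and the value should contain both -g and -std=c99. Note that a value given on the command line (make CFLAGS=-g) replaces both assignments, so -std=c99 is lost unless the Makefile uses the override directive.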
    

    Libraries

    To use a library in a program, the compiler needs flags at both compile time (where to find the headers) and link time (which libraries to link and where to find them).

    You can provide default values when including libraries, for example the default header location /usr/include/; if you follow this approach, put the value in a variable that can be overridden from the environment (?= set).

    Suppose we want to compile and link our program against OpenSSL, a widely used library for cryptography and the TLS/SSL protocols; we would intuitively add -I/usr/include/openssl to our CFLAGS/CXXFLAGS. That works on most Linux systems, but a macOS user could have the OpenSSL headers in /usr/local/include/openssl, which breaks compilation. The same goes for cross-compilation.

    What should be done instead is:

    OPENSSL_INCLUDE ?= -I/usr/include/openssl
    OPENSSL_LIBS ?= -lssl -lcrypto
    
    CFLAGS ?= -O2 -pipe
    CFLAGS += $(OPENSSL_INCLUDE)
    LIBS := $(OPENSSL_LIBS)
    

    While this approach results in a successful compilation, provided the values are overridden when needed, it is cumbersome and error-prone. A better way to include external libraries is to use pkg-config.

    pkg-config

    pkg-config [8] is a command-line tool that provides the correct compiler options for including libraries; it is widely used in Makefiles, but also in build systems such as CMake [9] and Meson [10].

    Let’s rewrite the previous snippet using pkg-config:

    PKG_CONFIG ?= pkg-config
    
    CFLAGS ?= -O2 -pipe
    CFLAGS += -std=c99
    CFLAGS += $(shell ${PKG_CONFIG} --cflags openssl)
    LIBS := $(shell ${PKG_CONFIG} --libs openssl)
    

    Note how the pkg-config executable can be overridden from the environment, again to support cross-compiling.

    Immediate set should be used with LIBS to avoid spawning pkg-config every time the variable is evaluated. Thanks to u/dima55 [11] for the tip.
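
    The same reasoning applies to the compile flags: appending a $(shell ...) call to a recursively expanded CFLAGS runs pkg-config again at every expansion. A possible variant, sketched below under the assumption of GNU Make and an installed openssl pkg-config module, stores both results in simple variables first; foo and foo.c are hypothetical names, and OPENSSL_CFLAGS is a helper variable introduced only for this example:

```makefile
PKG_CONFIG ?= pkg-config

# Run pkg-config once per query and keep the results in simple variables.
OPENSSL_CFLAGS := $(shell $(PKG_CONFIG) --cflags openssl)
OPENSSL_LIBS   := $(shell $(PKG_CONFIG) --libs openssl)

CFLAGS ?= -O2 -pipe
CFLAGS += -std=c99 $(OPENSSL_CFLAGS)
LIBS   := $(OPENSSL_LIBS)

# Link step; foo.o is built from foo.c by make's implicit rule,
# which passes $(CFLAGS) and $(CPPFLAGS) to $(CC).
foo: foo.o
	$(CC) $(LDFLAGS) -o $@ $^ $(LIBS)

.PHONY: clean
clean:
	rm -f foo foo.o
```

    A cross-compiling user can then point PKG_CONFIG at the wrapper provided by the toolchain, either in the environment or on the make command line, without touching the Makefile.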

    Miscellaneous

    Other executables often used when compiling are ar, ranlib and as; don’t call these executables directly but store their name in variables and use these variables instead.

    AR ?= ar
    RANLIB ?= ranlib
    AS ?= as
    

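    The implicit rules rely on these variables too. For instance, the make documentation [12] says of the rule that assembles n.s into n.o: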

    The precise recipe is ${AS} ${ASFLAGS}.
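
    Explicit rules should follow the same convention. As a minimal sketch, here is a static library rule that only goes through the variables; libfoo.a, foo.o and bar.o are hypothetical names, and the rc archive flags are just one common choice:

```makefile
AR ?= ar
RANLIB ?= ranlib

# Object files are built from foo.c and bar.c by the implicit rule.
libfoo.a: foo.o bar.o
	$(AR) rc $@ $^
	$(RANLIB) $@

.PHONY: clean
clean:
	rm -f libfoo.a foo.o bar.o
```

    A cross-compile environment that sets CC, AR and RANLIB to the target toolchain can then build the library without any change to the Makefile.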

    Installation

    The last part of a Makefile is the installation of the program itself and its related data. This is the trickiest part because there are many kinds of files, and each one can be installed in various locations.

    PREFIX

    Before talking about the different components to install, we should briefly discuss the PREFIX variable. Binaries usually go in a bin directory. On Linux, /usr/bin is used for packages managed by the package manager and /usr/local/bin for programs installed by the user; FreeBSD ports use /usr/local/bin. The same goes for the other components (data, man pages).

    It is important to specify which PREFIX the bin directory should be installed in:

    PREFIX ?= /usr/local
    

    The default value should be /usr/local [13].

    BINDIR

    The principal component of every program (excluding libraries) is the executable itself. As stated before, the usual directory is ${PREFIX}/bin, so at first glance we could choose to install the executables directly there, like this:

    install:
        @cp -p foo ${PREFIX}/bin/foo
    

    While this is correct most of the time, it limits the flexibility of the Makefile we’re writing. To give users more choice, we should instead use a variable to specify the location of the executables: BINDIR.

    BINDIR ?= ${PREFIX}/bin
    
    install:
        @cp -p foo ${BINDIR}/foo
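
    The cp call above assumes that the target directory already exists. One possible refinement, in the spirit of storing tool names in variables, is to go through the install utility, which can create the directory and set the file mode. This is only a sketch: INSTALL is a variable name we introduce here (it is also the name used by the GNU Coding Standards), and foo is the hypothetical executable from the example above:

```makefile
PREFIX ?= /usr/local
BINDIR ?= $(PREFIX)/bin
INSTALL ?= install

.PHONY: install
install: foo
	# Create the directory if needed, then copy foo with mode 755.
	$(INSTALL) -d $(BINDIR)
	$(INSTALL) -m 755 foo $(BINDIR)/foo
```

    A user who wants the executable elsewhere can run make install BINDIR=/opt/foo/bin, where /opt/foo/bin is just an example path.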
    

    DATADIR

    From the GNU Coding Standards [14]:

    The directory for installing idiosyncratic read-only architecture-independent data files for this program.

    The DATADIR directory, usually /usr/share, contains read-only data, from application icons to man pages and documentation.

    DATADIR ?= ${PREFIX}/share
    

    It is important to specify DATADIR because MANDIR and other variables depend on it.

    It can happen that this directory doesn’t start with PREFIX; for example a cross-compiled system can have the executables in /usr/${ARCH}/bin (therefore /usr/${ARCH} as PREFIX) and have /usr/share as DATADIR.

    MANDIR

    The location where the man pages are installed is stored in MANDIR.

    If you are using DATADIR:

    MANDIR ?= ${DATADIR}/man
    

    Otherwise:

    MANDIR ?= ${PREFIX}/share/man
    

    Both approaches are equally flexible, but the first is better because a user can set just DATADIR and forget about MANDIR.
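
    Because ?= creates recursively expanded variables, an override of PREFIX or DATADIR propagates to every variable derived from it. The following sketch makes the chain visible; show-dirs is a hypothetical helper target introduced only for this illustration:

```makefile
PREFIX  ?= /usr/local
DATADIR ?= $(PREFIX)/share
MANDIR  ?= $(DATADIR)/man

# Print the directories that an install would use.
.PHONY: show-dirs
show-dirs:
	@echo "PREFIX=$(PREFIX)"
	@echo "DATADIR=$(DATADIR)"
	@echo "MANDIR=$(MANDIR)"
```

    With none of these variables set in the environment, make show-dirs DATADIR=/usr/share should report MANDIR as /usr/share/man while PREFIX stays at /usr/local, and make show-dirs PREFIX=/usr should move all three under /usr.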

    DESTDIR

    Now that we have chosen the locations of all the components, there is one last important question to ask ourselves: which is the base install directory?

    We usually want to install the program under the root directory, which works with everything we’ve seen so far. But suppose the target system is mounted somewhere else, or the toolchain later merges a special destination directory into the root install: how can we target this custom destination directory?

    DESTDIR is a special variable that lets us specify a destination directory. Its usage is simple: in the install section, we prefix all the variables we saw with ${DESTDIR}.

    # foo.1 is assumed to be the name of the man page
    install:
        @cp -p foo ${DESTDIR}${BINDIR}/foo
        @cp -p foo.1 ${DESTDIR}${MANDIR}/man1/foo.1
    

    There is no need to set it in the Makefile: if the user doesn’t set DESTDIR, it expands to nothing and the files are installed relative to /.

    Note that the following assignment to BINDIR is incorrect, because DESTDIR should only be added later, in the install section, and not in the variable assignment.

    BINDIR = ${DESTDIR}${PREFIX}/bin
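
    Putting the pieces of this section together, a staged install could look like the sketch below. It assumes a program foo with a man page foo.1 (both hypothetical names) and creates the target directories first, since a fresh staging directory is usually empty:

```makefile
PREFIX  ?= /usr/local
BINDIR  ?= $(PREFIX)/bin
DATADIR ?= $(PREFIX)/share
MANDIR  ?= $(DATADIR)/man

# DESTDIR appears only in the recipe, never in the assignments above.
.PHONY: install
install: foo foo.1
	mkdir -p $(DESTDIR)$(BINDIR) $(DESTDIR)$(MANDIR)/man1
	cp -p foo $(DESTDIR)$(BINDIR)/foo
	cp -p foo.1 $(DESTDIR)$(MANDIR)/man1/foo.1
```

    Running make install DESTDIR=/tmp/stage (an example path) should then place the files under /tmp/stage/usr/local, while the paths recorded in BINDIR and MANDIR still describe the final location on the target system.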
    

    Other locations

    There are many other locations; we have covered only the most used ones. Refer to the GNU Coding Standards [14] and to the Makefiles of big projects to see other variables.

    Reproducible builds

    Another practice that has become increasingly common in recent years is reproducible builds [15]:

    Reproducible builds are a set of software development practices that create an independently-verifiable path from source code to the binary code used by computers.

    This practice is being adopted by Debian [16] and it is worth adopting in general. Explaining how to adopt reproducible builds goes beyond the scope of this post, but you can read more on the official site [15].

    Conclusion

    By following these rules and standards, you will have a better Makefile that adapts to a wide range of user environments and supports cross-compiling.



