C programming language

Robert Crowther Apr 2022
Last Modified: Feb 2023

What is it?

A programming language.

What it’s made from

There are a few versions of compilers. Some of the main ones are Intel (from the company that makes microprocessors), GCC (from the GNU Project), and Clang (a C compiler which uses the LLVM compiler‐creation pipeline). These compilers are often written in C themselves, or in its descendant C++.

Where it can run

C can run on anything that has a compiler for it. And there are compilers for nearly any computer any normal person can think of. Indeed, some compilers can handle nearly any computer you can think of.

The C language is often described as ‘cross‐platform’. In the early days, this was an important claim and mostly true. In practice, numerous compiler extensions and platform‐specific libraries mean code written for one compiler and platform may not work on another. However, if you stick to the core ideas, the C language remains fundamentally platform‐free and mostly portable. Note that C needs to be compiled; it will not run on an ‘engine’.
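One small instance of ‘sticking to the core ideas’ is not assuming how wide an int is. What follows is a minimal sketch, assuming a C99 compiler that provides <stdint.h>; the function name checksum16 is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums bytes into a 16-bit value.
   uint8_t and uint16_t have the same width on every platform that
   provides them, whereas a plain 'int' may be 16, 32 or more bits. */
uint16_t checksum16(const uint8_t *data, size_t len) {
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        /* The cast makes the wrap at 65536 explicit and identical everywhere. */
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}
```

The point is only that the arithmetic behaves the same whichever compiler builds it.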

Who looks after it?

By rights, nobody. At first, people simply used it. Later they referred to the original book on the subject. After that, a body called ANSI published a specification (‘C89’) that most people stuck to. Since 1990, the International Organization for Standardization (ISO) has published the standards. Hence ‘C99’, ‘C11’ and so forth.

History

In the beginning, C was developed to code the UNIX operating system. So the language needed to be light in weight, to facilitate a lot of writing, to be accurate about what it was doing, and to have sympathy with low‐level mechanisms.

To create C, the makers started with a language called BCPL, by way of their own cut‐down version named B (hence the name ‘C’). BCPL was written to work as an intermediate language, to sit between compilers and more fluent front ends. BCPL was admired for its elegance and the fact it could be implemented in less than 16k, but had run into issues over a few features. The makers of C threw out these features, added structure from some of their own research, and so created C.

At first C was regarded as an oddity, and not an archetypal or solid computer language. It was also viewed as a high‐level language. This does not seem to be well‐remembered. But, over time, the status of C has changed. It stopped being a near‐scripting language, and became the go‐to language for handling low‐level computer interaction. Other similar languages that could have done the job have retreated into niches, or vanished altogether. C stays put.

Licence

The standard is open. Tool licensing depends on the tool. I recall Intel has some proprietary aspects, and Microsoft gear (Visual Studio) costs money. The GNU compiler, GCC, is Free Software. The programs you build can carry any licence you choose.

Install

Here I mainly refer to GCC. Much the same for most compilers.

If you’re on a UNIX, a C compiler may not be pre‐bundled but should be available in packages (e.g. Debian’s ‘build-essential’ package includes the GCC compiler and plenty more). If you’re on Microsoft Windows, your C compiler is in Visual Studio. So I suppose you would say install is easy. But I will throw in that compiling a modern C compiler from source is a horror.

Internal structure

C is a compiled language. There has been, I recall, an interpreter, but it was never much used.

The compilers are the usual classic arrangement of lexers, syntax trees, intermediate languages, code generators, etc. They are usually highly optimised, with ornate internals. C compilers usually generate ‘object files’ containing machine code for execution; there is no runtime layer looking after this.

In use

As it is minimal, the language is generally simple to look at and easy to understand. It can become tricky when preprocessor commands appear, because those usually use code‐culture argot. The same can be said for large builds, when it can be difficult to find what is happening where. C is not worse than other languages in this, but not good, either.
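As a taste of the preprocessor argot, here is a minimal sketch of two idioms that turn up in a lot of C code. The macro and function names are chosen for illustration only, not taken from any particular codebase.

```c
/* Classic function-like macro. The extra parentheses are part of the
   argot: without them, SQUARE(1 + 2) would expand to 1 + 2 * 1 + 2. */
#define SQUARE(x) ((x) * (x))

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical function using the macro above. */
int sum_of_squares(const int *values, int count) {
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += SQUARE(values[i]);
    }
    return total;
}
```

Neither macro is type‐checked. The preprocessor only pastes text, which is why these lines read differently from the rest of the language.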

As for building code, depends how you look at it. Fundamentally, it’s easy to compile C code. You can write a small program in a text file, compile with one line, and it’s done. But, in practice, making even a small, but viable, C program is a nightmare. Most of the struggle is the compilers and the build tools. The compilers need vast strings of configuration to locate code libraries and set their zillions of compile parameters. The build tools are something else—vast globs of stuff with their own conventions and tweaky setup.

There is a mass of checking/profiling/performance tools available. Anything you can think of has been written for C. Well, maybe not a formal proof engine.

What does it look like (configuration)?

The language looks like this,

#include <stdio.h>

int main(void) {
    int a = 3;  /* the value to test */

    if (a > 5) {
        printf("a was greater than 5!\n");
    }
    else {
        printf("a was not greater than 5! a: %d\n", a);
    }

    return 0;
}

May not seem like much now, but that was special. Note the if…else construct, which C did much to popularise in this form (someplace there are theoretical papers on that). Note the brackets used for delimiting sections of code. And the placement of commands outside the brackets for parameters. This was a big deal at the time, inherited by many languages since. Also worth pointing out: even for printing, an include is needed, because the C core is designed to be small.

What does it look like (output)?

C can make anything you can think of. More to the point, it probably has done. After all, this is (mostly) what the UNIX operating system is coded in. GUI tools, drivers for computer cards, databases, you name it. There’s a continued undercurrent of what is near‐insult in descriptions of C, saying it is for ‘low‐level’ coding. That said, some value C for its portability so, fifty years later, big programs are still being built in C. Though, especially for big projects, other languages may be faster to develop, more robust to build, more rigorous for test, and easier to maintain.

Project policy

There are volumes of books on how to use C, where to use it, and so on. But there are no obligations, or even formal reference work. Coders and compilers stick to a standard or not. Mostly the standards organisations seem to have stuck to backwards‐compatibility, while cautiously introducing a few extensions to handle new computer features.

Getting started

The compilers, capable of working at low‐level, tuned precisely, seeking libraries on systems, are a nightmare. The build environments, made to handle every situation and computer you can think of, are a similar terror. You are probably best starting with an IDE that sets most of a build and dependencies for you. That said, a first compile with GCC can be as simple as,

gcc -Wall -o test test.c
./test
rm test

But this simplicity has been lost in the culture. And if you use this as a start, you will soon be demanding more—and not getting it.

Extending

Umm, if you are reading this article, you will not be hacking a compiler to make C ‘different’, or ‘better’.

Libraries

Whatever you can think of. Bear in mind that swathes of the low‐levels of operating systems are written in C, so there is a library for anything.

Using a library, getting it onto a computer, connecting to a program, then calling it, is a royal pain. Originally, the ideas were simple and neat. But once you’ve waded through OS placement, header files, shared libraries, version handling, build environment recognition, compiler dependencies and all the rest of it, you’ll be yearning for modern solutions.

Deployment

Now that is simple. Anything you compile in C will finally be a low‐level executable built from ‘object files’. The kind of thing that your operating system launches all over the place to get started. You only need to launch the executable, in whatever way the OS launches executable files.

Documentation

The classic is the original book. Otherwise, you will find everything from posts on the basics to a shelf of books. The real problem is elsewhere, getting a project of any size to build and run.

MIA

There are a few.

Some people have, on and off, complained that it is a shame the C preprocessor is limited. There is a reason: it was a very light addition for includes and constants. It was, I recall, made to discourage advanced use. It was also good for a few inline additions, and has found a use manipulating headers for varying environments. With a generalised preprocessor, C could do a lot of useful and clever stunts with code. But that would be at a cost of obscure code, and arguably introduce a field of incompatibility.
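The ‘varying environments’ use looks something like the following. This is a minimal sketch: _WIN32 is a macro predefined by compilers targeting Windows, while PATH_SEPARATOR and path_separator() are hypothetical names for illustration.

```c
/* Pick a constant at compile time, depending on the target platform. */
#if defined(_WIN32)
#define PATH_SEPARATOR '\\'
#else
#define PATH_SEPARATOR '/'
#endif

/* Hypothetical accessor, so the rest of the code never tests the platform. */
char path_separator(void) {
    return PATH_SEPARATOR;
}
```

Only one branch ever reaches the compiler proper. The other is discarded as text before compilation starts.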

Like other minimal languages such as Javascript, C has no namespacing. The reason for this is that namespacing can be provided by other methods, such as convention or use of the preprocessor. Like Javascript, the lack of namespacing in C produces obscure commands or elaborate naming systems, which have namespacing as their purpose.
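Namespacing by convention usually means a prefix on every public name. Here is a minimal sketch, assuming a made‐up library called ‘geom’; none of these names come from a real library.

```c
/* Every public name carries the hypothetical 'geom_' prefix. */
typedef struct {
    int x;
    int y;
} geom_point;

geom_point geom_point_add(geom_point a, geom_point b) {
    geom_point r = { a.x + b.x, a.y + b.y };
    return r;
}

/* 'static' keeps this helper private to the source file, which is
   the only file-level scoping the language itself offers. */
static int clamp(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

int geom_point_clamped_x(geom_point p, int lo, int hi) {
    return clamp(p.x, lo, hi);
}
```

The prefix does the job of a namespace, at the cost of long names everywhere.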

There’s a solid syntactical purpose for the semi‐colons at line‐ends, but they are prone to errors. Most modern computer languages, even if they are C‐like in other ways, have dropped them. Notably, Javascript used semi‐colons to finish lines, but has tried to abandon them.

Many other comments are simply a wish that C was more sophisticated. But that would have denied the small core of the language, which is a large appeal, especially if you need to develop for something like an electronic circuit, as opposed to a full computer. I’ll mention two, though…

Lack of typing for functions, pointers, and void* returns. If you are interested, you need to look these issues up (you will find plenty). It comes down to this: C is strongly and precisely typed for numerics, but is all but free‐typed for other data. And this can cause chaos, because these other actions and objects are things where a typecheck would be most welcome.
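The standard library’s own qsort() shows the void* issue in miniature. A minimal sketch follows; qsort() is real, while compare_ints and sort_ints are hypothetical names.

```c
#include <stdlib.h>

/* Comparator for qsort(). The signature is fixed by the standard
   library, so the element type is lost and must be restored by a cast. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);  /* -1, 0 or 1 without overflow */
}

/* Hypothetical wrapper that puts the type back for callers. */
void sort_ints(int *values, size_t count) {
    qsort(values, count, sizeof values[0], compare_ints);
}
```

Nothing in the language stops someone handing compare_ints to qsort() along with an array of doubles. The compiler accepts it, and the results are garbage.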

The lack of any object superior to a ‘struct’ has caused elaborate systems to be built on top of C. These vary. Some are home‐rolled conventions. Then there is GObject, made by Gnome to power GTK desktop software. After that, separate languages like Objective C and C++. That said, these extensions increase the core size of C considerably.
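A home‐rolled convention often amounts to a struct carrying function pointers. The following is a minimal sketch of the idea with hypothetical names; it is not how GObject or any particular system does it.

```c
/* Hypothetical 'shape' type: the function pointer stands in for a method. */
typedef struct shape shape;
struct shape {
    int width;
    int height;
    int (*area)(const shape *self);
};

static int rect_area(const shape *self) {
    return self->width * self->height;
}

static int triangle_area(const shape *self) {
    return self->width * self->height / 2;
}

/* 'Constructors' choose which behaviour an instance gets. */
shape shape_make_rect(int w, int h) {
    shape s = { w, h, rect_area };
    return s;
}

shape shape_make_triangle(int w, int h) {
    shape s = { w, h, triangle_area };
    return s;
}
```

A caller writes r.area(&r), passing the object to its own ‘method’ by hand, which is the part that real object systems automate.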

Examples of use

How about the entire Linux kernel, the GTK desktop, and the bulk of Microsoft system and tools until about Windows 8?

Off‐hand facts

Weird API

There are a few freaks. Probably the classic is that there are some ambiguities in the way C can be written. But, in my opinion, C is low on these kinds of problems.

In the wider environment, there is the case of header files. Header files are small files, usually associated by name with a C file, either C source or compiled code. Header files declare the public contents of the C file. Header files allow easy external assessment of codebases. In the case of pre‐compiled code libraries, these files are necessary, because otherwise users would not know what is in the lump of code. However, as systems came to rely on header files, the culture became rigorous, refusing to function without headers. At the same time C header files are slack or, shall we say, a lightweight version of what can be done. To protect larger systems they became crusted with weird preprocessor qualifications and defence code. The result is that a modern C header file is riddled with dirty code, as well as being an ugly repetition of what is in a source file. It’s no surprise that modern Just‐In‐Time scripting languages have ditched them. C header files, especially, are nothing but pain: jargon to write, a source of obscure errors and a maintenance burden.
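To show the crust, here is a minimal sketch of a header and its source, squeezed into one listing so it compiles on its own. The file names counter.h and counter.c, and everything in them, are hypothetical.

```c
/* ---- counter.h (hypothetical header) ---- */
#ifndef COUNTER_H   /* include guard: stops the file being read twice */
#define COUNTER_H

#ifdef __cplusplus  /* lets C++ code include the same header */
extern "C" {
#endif

int counter_next(void);    /* public declarations, repeated below */
void counter_reset(void);

#ifdef __cplusplus
}
#endif

#endif /* COUNTER_H */

/* ---- counter.c (hypothetical source) ---- */
/* In a real two-file layout, #include "counter.h" would go here. */
static int count = 0;      /* private: never mentioned in the header */

int counter_next(void) {
    return ++count;
}

void counter_reset(void) {
    count = 0;
}
```

Two declarations need ten lines of guard and linkage boilerplate around them, and each declaration is then written out again in the source.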

Weird behaviour

While C compilers are too sophisticated and mature to deliver ‘wrong’ results, there is an issue or two that is part of the spec. Perhaps the largest is null‐terminated strings. Null‐terminated strings introduce security holes that have been endlessly discussed. They have a fundamental type issue (is the NUL a character?). Yet they are neat, can recover, and can be of unbounded length. Fifty years on and there is no verdict.
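What null‐termination means in practice can be sketched in a few lines. The functions below are hypothetical re‐implementations for illustration, in the spirit of the library’s strlen(); they are not the library code.

```c
#include <stddef.h>

/* Walk the bytes until a NUL is found. There is no stored length,
   so a missing terminator means reading past the end of the buffer. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Bounded variant: never examines more than 'max' bytes, so an
   unterminated buffer cannot send it off into other memory. */
size_t my_strnlen(const char *s, size_t max) {
    size_t n = 0;
    while (n < max && s[n] != '\0') {
        n++;
    }
    return n;
}
```

Note the type issue too: a NUL in the middle of the data silently ends the string, so "ab\0cd" has a length of 2 as far as my_strlen is concerned.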

Performance

Fast. As fast as nearly any other language of note. We are talking orders of magnitude faster than any scripting language and, in general, an order faster than Java. Of non‐esoteric languages, only Ada and C++ are in the same field. Two contributing factors are that C is close to the underlying system anyway, and that the compilers represent a monument on the scale of Stonehenge.

Summary

Pros,

Cons,

Conclusion

C was designed as a minimal language capable of low‐level system interaction. It was a full‐scale language though, for writing an operating system without using machine code. Though not solid and respectable, like COBOL for example, C was seen as a high‐level language. As the world moved on, C remained as a language for projects that need to be close to the machine, where speed and poking is everything. In the course of this change, C code needed to be bulwarked with build tools. Due to the minimal core, these support structures have become Gothic constructions. These kinds of fortress codebases are, arguably, not the best distribution of effort, and lead to horrible API.

I know this is a big ‘if’, but I feel much has been lost with C. Imagine if the ambiguities in syntax had been fixed, and namespacing introduced. Some would, at this point, raise C++, but I suggest we hold close to the original compact core. Then keep going and do something about the semi‐colons, type the pointers, and do something about the crusted header files. And all the time insist on the minimal compilers of the original BCPL. We’d have a hot‐rod Javascript, strong‐typed, easy to read, and massively portable. As it is, if you need to go down to the system, or need long‐lasting code that can be used by other codebases, C (or C++) has been the language for forty years. But the fun is long‐dead.

References

Wikipedia, lean and informative entry,

https://en.wikipedia.org/wiki/C_(programming_language)

The International Obfuscated C Code Contest. More of this, I suggest, may have been good for C,

https://en.wikipedia.org/wiki/International_Obfuscated_C_Code_Contest