The C Standard Library
Last Modified: 29th March 1996
Barry McMullin


Note that this complete document is available in
LaTeX (stdlib.tex) and plain
ASCII (stdlib.txt) forms, to allow downloading and
offline browsing.

Introduction

This document is one component of the hypermedia documentation for the
course Software Engineering 1
(http://www.eeng.dcu.ie/7Emcmullin/swe1/swe1root/swe1root.html). It
presents more detailed notes on the specific topic of the C
Standard Library.

On Re-inventing Wheels

It is well known that one should not waste time re-inventing the
wheel. In Engineering, this means not redesigning something one
has already got a perfectly satisfactory design for. In Software
Engineering, it means not rewriting software to perform
operations that one has previously written (and tested!) software
for.

In the case of the C language there is a range of things that
programmers very frequently wish to do, which are so common that
standard software for the purpose is actually distributed along
with every C compiler. This software is called the C
Standard Library. You are more or less guaranteed that every
C compiler will come with an implementation of the Standard
Library. It is important to be aware at least of the existence of
the Standard Library, and to have some outline idea of the
software contained in the Library. In this way you can avoid
unnecessarily writing and testing software which is already
effectively available, in a well tested form, for free.

The recommended compiler for the current course is
DJGPP (http://www.delorie.com/djgpp), which is a port
(version) of the GNU C Compiler for the PC
platform.  Detailed information on the
C Standard Library (Version 1.12) provided with
this compiler is available in the
Web-based documentation
(http://www.delorie.com/djgpp/doc/libc-1.12/libctoc.html).
You should consult this
whenever you need details of the full range of
functionality of the Standard Library.

On the other hand, this essay is intended
only to provide an introduction to a very small subset of the
Standard Library, and is not a substitute for full DJGPP
documentation.

Mechanics

In order to use the facilities of the Standard Library, one must
have some understanding of the mechanics of how the compiler will
access it.

Some very simple C programs can be
completely
self-contained: all the code for all the functions is
contained in a single program file. In general, however, this
need not be the case; in particular, it will not be the case if
you wish to make use of the Standard Library.

In principle, of course, the Standard Library could be
distributed simply as a set of one or more C source code files.
You would then use a library function simply by copying the
source code out of the relevant library file and pasting it into
your own program file. However, this is not the mechanism
which is used in practice, for a variety of reasons:



Since the code for the Standard Library is rarely if ever
modified, it is very inefficient to have to recompile or
retranslate it every time a programmer makes a change to his own
code.

The library functions are, in some cases, not actually
written in the C language at all.

The compiler suppliers often prefer not to distribute
source code for the Standard Library as this would facilitate
unscrupulous pirating of their work (although this does
not apply to DJGPP).



What is actually done is that the Standard Library software is
distributed in a sort of canned or precompiled form,
as a set of object files (or, more commonly, a single
object library file). Your program file(s) are then compiled
completely separately from the source code for the Standard
Library, yielding one or more further object files. Finally, all
the required object files, both your own, and those making up
the Standard Library, are combined or linked together to
yield the executable file, which can actually be run.

To a large extent this process can be made automatic and
invisible. The compiler compiles your source file(s) to the
object code form, and then automatically links these together
with any required library object file(s).

However, it cannot be made completely automatic. In
practice there is certain information, associated with the
Standard Library, which the compiler must be made aware
of at the time it is compiling your source files, in
order to make sure that the object files produced will correctly
link up with the object files for the Standard Library.

Since, for the most part, the Standard Library simply consists of
a set of functions which you can call in your program,
the most important such information is the way data is to
be exchanged with each such function; i.e. how many 
parameters does the function take, of what types, in what
order, and what type of return value does the function
yield (if any)?

The compiler could, of course, try to infer this information from
the way the function is actually called; but this is not always
technically possible, and, in any case, it is much better if
the compiler has some independent way of knowing how the
function should be called, because then it can automatically
check whether you have, in each case, called it correctly. It
turns out that this is extremely useful, and is an effective way
of automatically detecting a range of very common programming
mistakes.

The compiler is given this information about how a function
should be called by the use of a so-called function
prototype. This is simply the header of the function
definition, which states the function name, the return
type, and the formal parameter list. But whereas, in a function
definition this is then followed by a compound statement which
actually defines the function, in a function prototype it is
simply followed by a terminating semicolon. Thus, a prototype
might look like this, for example:

    int multiply(unsigned *, unsigned, unsigned *);

This tells the compiler that the function called
multiply should take three parameters, respectively of
types (unsigned *), unsigned and (unsigned *), and
will yield a return value of type int. Function
prototypes should appear at the outermost level of a source
file, i.e. not within the definition of any function.
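
As a minimal sketch of how a prototype lets the compiler check calls, the fragment below declares a multiply function with the parameter types described above, calls it, and only then defines it. The parameter names, the meaning given to each parameter and the overflow convention are all assumptions made purely for illustration; the text specifies only the types.

```c
#include <limits.h>

/* Prototype: tells the compiler how multiply() must be called.
 * (Parameter names and behaviour are hypothetical.) */
int multiply(unsigned *result, unsigned factor, unsigned *value);

/* A caller that appears before the definition; the prototype above
 * is what lets the compiler check this call. */
int double_it(unsigned *value, unsigned *result)
{
    return multiply(result, 2u, value);
}

/* Definition: the same header as the prototype, followed by a
 * compound statement instead of a semicolon.
 * Stores factor * (*value) in *result and returns 0, or returns -1
 * without storing anything if the product would exceed UINT_MAX. */
int multiply(unsigned *result, unsigned factor, unsigned *value)
{
    if (factor != 0 && *value > UINT_MAX / factor) {
        return -1;
    }
    *result = factor * *value;
    return 0;
}
```

With the prototype in place, a call with a missing argument, such as multiply(result, 2u), should be rejected when the file is compiled rather than misbehaving when the program runs.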

So far, so good. It seems that if you wish to call or invoke any
of the Standard Library functions in your program, you must
simply insert, somewhere before the function(s) which make such
call(s), a suitable function prototype, so that the compiler will
then be able to decide whether the calls are correct or not.

But how are you going to know what is the correct prototype for
each function in the first place?

Well, one possibility is to look up the function in the detailed
technical documentation for the Standard Library: this will
normally include the function prototype.  You can then copy that
into your own file.

However, this is clearly unsatisfactory.  Apart from being
laborious, it is error prone, and the possibility of an error in
the function prototype undermines a primary point of using
prototypes in the first place, namely that it allows the compiler
to crosscheck for valid function calls!  A better idea is if the
prototypes are provided to you in a machine readable form, i.e.
in one or more files on the computer. Then you can simply copy
the required ones, and paste them into your own source files.

Well, yes, this is a major improvement, but still leaves
something to be desired. For one thing, duplicating the
prototypes in every source program wastes disk space. More
seriously, there is always a danger that you might (accidentally)
modify a prototype, again confounding the idea of allowing the
compiler to automatically crosscheck the prototype against the
invocation(s) of the function.

A final possibility is to leave the prototypes in one or more separate
files; but have the compiler automatically access or scan the
relevant files immediately before, or as part of the process of,
compiling your files.  In this way, there is no
laborious, manual, copying of the prototypes, but no duplication
and no risk of accidental modification either.

Files which are used for this kind of purpose (which contain no
actual executable code, but which contain only function prototypes,
and possibly other things such as symbolic constants etc., which
allow the compiler to correctly mesh the file it is compiling
with some other software which has been pre-compiled) are
called header files. Header files are normally given the
extension .h to distinguish them from files which actually
contain executable code (function definitions etc.), which will
normally have the extension .c.

It would be possible, in principle, to put all the prototypes for
all the functions in the Standard Library, plus any other
required information (symbolic constants etc.), in a single
.h file, and have the compiler automatically include it in
compiling any file. However, in practice this is not done for
various reasons. For example, most .c files only involve
calling a small subset of the functions in the Standard Library;
it is then wasteful and time consuming for the compiler to
process prototypes for all functions in the Standard
Library.

The mechanism that is actually used then is as follows.  A series
of separate .h files are supplied with the compiler. Each
one provides prototypes (plus other required information) relating
to only some coherent or related subset of the functions
in the Standard Library. The programmer must then explicitly
instruct the compiler to process just those .h files which
are required in order to properly compile any particular .c
file.  And this is done by inserting, into the .c file, one
or more so-called #include directives. A #include
directive is much the same as a #define in the
sense that it is not handled by the compiler proper but by
the pre-processor, which runs (automatically) immediately
before the compiler.  In this case, the pre-processor processes a
#include by inserting the contents of the .h file into
the original .c file at the point of the directive, to produce a
big temporary file which
is what is actually then processed in the compilation phase proper.

Thus, if you want to use a function from the Standard Library,
you must first look up, in the relevant technical documentation,
which header file contains the prototype etc. for that function;
and then insert a line something like the following in your
.c file:

    #include <stdio.h>

The angle brackets around the name of the header file tell the
pre-processor to search for the file in the standard
directories for such things; normally these will be set up when
the compiler is installed, and you, as a programmer, need not
worry about what these standard directories actually are.

You may, of course, need to include several header files,
depending on the particular selection of Standard Library
functions you wish to use.

All required #include directives are normally placed close
to the top of your .c file, typically either at the very
top, or immediately after an initial introductory comment which
documents the overall contents of the source file. Each
#include must, in any case, precede any calls or
invocations of the relevant functions.

The rest of this essay is concerned with introducing just a very
small selection of the several hundred functions normally
available in the Standard Library. It is organised into sections
according to the distinct .h files required.

Implementation-defined Limits: limits.h

The header file limits.h essentially just provides
#define directives defining symbolic constants for the
maximum and minimum values allowed with the standard C integral
data types, i.e. it does not provide any function prototypes
as such. It is important to be able to refer to these values in
your programs in order to prevent, or at least detect, overflow
situations. The values potentially differ from one C compiler
to another (hence implementation-defined), but the symbolic
constants defined in limits.h always have the same names.
Thus by using the symbolic constants to refer to these limits it
should be possible to write your programs in such a way that they
will automatically adjust to whatever the limits actually are
with any particular compiler.

Some constants which are commonly used are:


INT_MAX : maximum value of int
INT_MIN : minimum value of int
LONG_MAX : maximum value of long
LONG_MIN : minimum value of long
UINT_MAX : maximum value of unsigned int
ULONG_MAX : maximum value of unsigned long


This is not an exhaustive list: examine a copy of limits.h
for yourself to see others. However, note that there is, of
course, no need for constants called, say, UINT_MIN or
ULONG_MIN since these minima are always guaranteed to be
simply zero.
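
One way of using these limits to detect overflow before it happens is sketched below. The function name checked_add and its return convention are assumptions made for this illustration; only the constants INT_MAX and INT_MIN come from limits.h.

```c
#include <limits.h>

/* Adds a and b if the sum fits in an int.
 * Returns 1 and stores the sum in *sum on success; returns 0 and
 * leaves *sum untouched if the addition would overflow. */
int checked_add(int a, int b, int *sum)
{
    if (b > 0 && a > INT_MAX - b) {
        return 0;               /* would exceed INT_MAX */
    }
    if (b < 0 && a < INT_MIN - b) {
        return 0;               /* would fall below INT_MIN */
    }
    *sum = a + b;
    return 1;
}
```

Because the tests are written in terms of the symbolic constants rather than literal numbers, the same source should behave correctly whatever size of int a particular compiler uses.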

Error Conditions: errno.h

Many functions in the standard library will detect exception
or error conditions in certain circumstances, essentially if
the function has been requested to do something which, for some
reason, it can't. The exact action of the function in such
situations depends on the details of the particular function and
the particular exception condition. However, the most usual
strategy is for the return value from the function to
signal, with some special value, that something has gone wrong.
It is then up to the calling site to react to this in some
appropriate way.  Minimally this will mean giving some kind
of overt or visible external signal of the problem.

In any case, while the standard library functions typically provide
an initial or gross indication that something has gone wrong via
the return value, it is often also useful for the calling
site to have access to more detailed information which clarifies
exactly the nature of the problem. This is usually achieved by
having the standard library function record a detailed error
number in a global int variable called errno.  This
variable is defined within the standard library itself: but your
programs can gain access to it by a so-called extern
declaration. This declaration is already provided in the header
file errno.h; so if you include this, you will then
be able to access errno just as any other variable is
accessed. By examining its value immediately after a call to a
standard function, your program can generally establish what (if
anything) went wrong. 

As well as the declaration of errno the file errno.h
also provides a series of #define directives which define
symbolic names for the standard error numbers or codes which may
be recorded in errno. This would allow your program to test
for specific error codes by comparing errno to these
symbolic values. We shall also see later how errno can be
automatically translated into a corresponding textual error
message, and, say, displayed, on the computer screen (see the
discussion of the function perror(), prototyped in
stdio.h).
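
The following sketch shows the usual pattern of clearing errno, calling a library function, and then comparing errno with a symbolic error code. It uses strtol(), a standard conversion function from stdlib.h which is not otherwise covered in this essay, and ERANGE, the standard code for an out-of-range result. The wrapper name parse_long and its return values are assumptions made for this illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Converts text to a long using the standard function strtol().
 * Returns 0 and stores the result in *out on success.
 * Returns -1 if text holds no digits, and -2 if the value is out of
 * range for a long (strtol() then sets errno to ERANGE). */
int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;                      /* clear any stale error code */
    value = strtol(text, &end, 10);
    if (end == text) {
        return -1;                  /* nothing could be converted */
    }
    if (errno == ERANGE) {
        return -2;                  /* overflow or underflow */
    }
    *out = value;
    return 0;
}
```

Setting errno to zero first matters because library functions record an error code when something goes wrong but do not reset errno when they succeed.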

Utility Functions: stdlib.h

The header file stdlib.h provides prototypes for a selection
of miscellaneous utility functions, as well as a few more
symbolic constants. A small selection of the more commonly used
functions is as follows:

int atoi(char *s)

This takes a string representation of an integer (i.e. a string
something like "145" or "-9999" or "000000"
etc.) and converts it into the corresponding internal
representation of type int, which is the return value
from the function.

The conversion will ignore any trailing data in the string
(e.g. "143yu*" gets converted as 143).

The behaviour where the answer would overflow int
(i.e. be greater than INT_MAX or less than
INT_MIN) is generally unpredictable or
indeterminate, so it is your
responsibility to make sure this never arises; if the conversion
simply cannot be done (e.g. atoi("xyz")) the return
value will be 0.
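
A small sketch of these rules in action follows. The helper sum_of_strings is a hypothetical name introduced only for this illustration; atoi() itself is the standard function.

```c
#include <stdlib.h>

/* Returns the sum of the integers represented by two strings.
 * Relies on atoi(), so text that cannot be converted counts as 0
 * and any trailing non-digit characters are ignored. */
int sum_of_strings(const char *a, const char *b)
{
    return atoi(a) + atoi(b);
}
```

Note that a caller cannot tell from the result whether a string held a genuine zero or could not be converted at all, which is one reason to validate input before handing it to atoi().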

long atol(char *s)

This takes a string representation of an integer (i.e.
a string something like "145" or "-999999" or
"25763178" etc.) and converts it into the corresponding
internal representation of type long, which is the return
value from the function. Leading white space in the string is
ignored.

Again, the conversion will ignore any trailing data in the
string, and the behaviour where the answer would overflow long
(i.e. be greater than LONG_MAX or less than
LONG_MIN)
is indeterminate; and if the conversion
simply cannot be done (e.g. atol("")) the return
value will be 0L.

void exit(int status)

This function may be called to forcibly terminate the program at
any point. The parameter status is simply a number which
is, in some sense, made available to the external environment
of the program. (In the case of programs run under
DOS, the exit() status can be accessed
via the errorlevel parameter in the DOS if
command, though this is normally only used in batch
files.)
In any case, if you are using exit() to terminate your
program, you should normally just give it one of two pre-defined
exit values which have been given symbolic names in
stdlib.h: use exit(EXIT_SUCCESS) if the program is
terminating normally and exit(EXIT_FAILURE) if it is
terminating because of some unexpected or intolerable exception
or error being encountered.

Input and Output: stdio.h

stdio.h (standard input/output) provides basic
functions for accessing data external to a program. This
naturally includes data in files on disk, but also covers data
coming from the keyboard, or displayed on the screen, or data
routed via any of the other input/output ports of the
computer. All of these sources or destinations for data may be
generically referred to as streams, or, more simply, 
files.

Files are classified into two kinds: text and
binary. A text file is divided into lines, where each line
has zero or more characters, and is terminated by a newline
character '\n'.

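FILE *fopen(char *filename, char *mode)

fopen() opens the file named by the string filename and, if
successful, creates a file pointer which identifies the open file
in all subsequent operations on it.

Under DOS, the directory separator in a file name is the backslash
character '\'; this is a rather unfortunate historical
accident, as this character also has a special meaning in C
strings, as an escape character. The nett effect is that, in
representing a DOS file name in a C string constant, you
must write the directory separator as a double backslash (\\).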

mode is a string specifying the desired file access
mode, i.e. what kind(s) of operations are going to be carried
out on it. Two examples of legal values for this would be:


"r" : Open the file for reading.
"w" : Open the file for writing.


In general, fopen() will open a file for access
either as a binary or a text file, but the default
is implementation specific. In any case, regardless of the default,
the type of access can be explicitly specified in the mode
string by adding either a t or b character for text
or binary respectively, e.g.:


"rt" : Open for reading as a text file.
"wb" : Open for writing as a binary file.


If the call to fopen() is successful, then the return
value is simply the value of the created file pointer. However,
if the call fails for any reason (e.g. trying to open a file on
device "X:" when no such device is present on the machine)
then the return value will be the special zero or
null pointer value defined in C. stdio.h (and, indeed,
several of the other header files) defines the symbolic name
NULL for this value. Thus, after a call to fopen()
your program should always check the return value to see
whether it is NULL, and take some appropriate action
if so (e.g. issue a suitable message and call exit()). In
any case, if the return value from fopen() is
set to NULL then errno will also be set to some more
specific error code.
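
The check on the return value of fopen() might be sketched as follows. The function name can_read_file is an assumption introduced for this illustration, as is the choice simply to report success or failure to the caller rather than to print a message and exit.

```c
#include <stdio.h>

/* Returns 1 if the named file can be opened for reading, and 0
 * otherwise. The file is closed again before returning. */
int can_read_file(const char *filename)
{
    FILE *fp = fopen(filename, "r");

    if (fp == NULL) {
        return 0;           /* fopen() failed: no file pointer created */
    }
    fclose(fp);
    return 1;
}
```

Under DOS, a call might look like can_read_file("C:\\DATA\\INPUT.TXT"), where the path is purely hypothetical and each directory separator is written as a double backslash.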

int fgetc(FILE *stream)

fgetc() reads the next sequential character (byte) from the
open file identified by the file pointer stream. This
character is converted into an int value, and is the
return value from the function. However, if the end of file
is encountered (the last character has already been read), or if
any other error is encountered in reading, then the return
value will be EOF, and errno will be set
appropriately.
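
A typical use of fgetc() is a loop which reads until EOF is returned, as in the sketch below. The function name count_lines is an assumption made for this illustration.

```c
#include <stdio.h>

/* Counts the newline characters remaining in an open stream by
 * reading it one character at a time until fgetc() returns EOF. */
long count_lines(FILE *stream)
{
    long lines = 0;
    int c;                  /* int, not char, so that EOF is distinct */

    while ((c = fgetc(stream)) != EOF) {
        if (c == '\n') {
            lines++;
        }
    }
    return lines;
}
```

The variable c is declared as int because that is the type fgetc() returns; storing the result in a char could make EOF indistinguishable from a legitimate character.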

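int fprintf(FILE *stream, char *format, ...)

fprintf() writes characters to stream under the control of the
string format. Ordinary characters in format are written out
unchanged, while each conversion specification causes the next
extra argument to be converted to characters and written
to the relevant stream.
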
Note that because the % character in the
format string is interpreted in a special way by
fprintf() (namely as introducing a conversion
specification), there is a problem if one actually wants to just
output this character. To get around this you just put in two
successive % characters, thus:
%%. The second one will cause fprintf()
to recognise that this is not a conversion specification after
all, and it will simply output one % character
without further ado (and without attempting to convert any
argument).

The return value from fprintf() is the number of
characters which have been written out, or a negative value (such
as EOF) if any error
has been detected. C programmers commonly neglect to check this
return value from fprintf(), presumably because error
conditions which it can detect and signal are rather rare.
Nonetheless, as a general practice I would recommend that you
include code in your programs to check this return value (at
least to see if it is negative), unless your usage is
extremely trivial (e.g. just printing the format string,
without having any extra arguments to be converted).
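
Both points, the doubled percent sign and the check on the return value, are illustrated in the sketch below. The function name print_percentage and the wording of the message are assumptions made for this illustration.

```c
#include <stdio.h>

/* Writes a line such as "Progress: 50%" to stream.
 * Returns the number of characters written, or -1 if fprintf()
 * reported an error. */
int print_percentage(FILE *stream, int percent)
{
    /* %d converts the int argument; %% outputs a single '%'. */
    int written = fprintf(stream, "Progress: %d%%\n", percent);

    if (written < 0) {      /* EOF is negative, so this covers it too */
        return -1;
    }
    return written;
}
```

Testing for any negative value, rather than for EOF specifically, keeps the check valid whichever negative value a particular implementation happens to return on error.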

int printf(char *format, ...)

printf(format, ...) is equivalent to fprintf(stdout,
format, ...).

int fscanf(FILE *stream, char *format, ...)

fscanf() is more or less the converse to fprintf():
it reads characters from stream and attempts to convert
them into internal representations of appropriate types, as
determined by conversion specifications in the string
format. The extra arguments must all be pointer types in
this case. Typically, these arguments will simply be the
addresses of variables (generated with the &
operator) into which the converted values are to be stored.
fscanf() thus simply stores each converted value at the
location pointed at by each extra argument in turn.  Again, it is
up to you, the programmer, to ensure that the arguments match the
conversion specifications correctly; and again, if they do not,
then this will not be automatically detected, and will have
entirely unpredictable effects.

The conversion specifications handled by fscanf() are quite
similar in format to those handled by fprintf(). They will
not be described further here.

The return value from fscanf() is the number of input
fields or values successfully scanned, converted, and stored, or
EOF if end of file or any error is encountered. In general,
it is a very good idea for your programs to check this
return
value to see that it has the expected value.
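
A minimal sketch of checking the return value of fscanf() follows. The function name read_pair_sum and its return convention are assumptions made for this illustration; %d is the standard conversion specification for a decimal int.

```c
#include <stdio.h>

/* Reads two whitespace-separated decimal integers from stream and
 * stores their sum in *sum. Returns 1 on success, or 0 if fscanf()
 * did not convert exactly two values. */
int read_pair_sum(FILE *stream, int *sum)
{
    int first, second;

    /* The extra arguments are addresses, generated with &. */
    if (fscanf(stream, "%d %d", &first, &second) != 2) {
        return 0;
    }
    *sum = first + second;
    return 1;
}
```

Comparing the return value with the number of values expected catches end of file, read errors and malformed input with a single test.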

int scanf(char *format, ...)

scanf(format, ...) is equivalent to fscanf(stdin,
format, ...).

int fclose(FILE *stream)

fclose() simply closes the open file associated with
stream. The return value is zero if this is completed
successfully; otherwise it is EOF and errno is set.

While open files should be closed automatically if or
when your program exits anyway, it is still a good practice to
explicitly close files with fclose() when your program is
finished with them. This makes the behaviour of the program more
intelligible to someone reading it, and also provides a little
extra robustness against the possibility that your program
might terminate abnormally (i.e. CRASH!).

void perror(char *s)

perror(s) outputs the string s on stderr,
followed by an error message which describes or elaborates the
error code currently stored in errno.  It is
equivalent to:

    fprintf(stderr, "%s: %s\n", s, "error message")

where error message is a string describing whatever error
condition is represented by the current value of errno.

perror() might usefully be called on any occasion when your
program detects that a standard library function has failed to
operate as expected.
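
For instance, a failed fopen() might be reported along the following lines. The function name open_or_report is an assumption made for this illustration, as is the choice of the file name as the string passed to perror().

```c
#include <stdio.h>

/* Tries to open filename for reading. On failure, reports the
 * reason on stderr with perror(), prefixed by the file name.
 * Returns the file pointer, or NULL if the file could not be opened. */
FILE *open_or_report(const char *filename)
{
    FILE *fp = fopen(filename, "r");

    if (fp == NULL) {
        perror(filename);
    }
    return fp;
}
```

The exact wording of the message after the colon depends on the implementation and on the error code stored in errno at the time.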

Diagnostics: assert.h

assert.h provides the declaration of one function-like
object with the following prototype:

void assert(int status)

For technical reasons, this object is not implemented as
a function in the normal sense, but is what is called a 
macro instead. However, for most practical purposes, it can be
regarded just as any normal function.

If status is non-zero then assert() will have no
effect; but if status is zero then a message of the
following form will be output to stderr:

Assertion failed: status, file
filename, line nnn

where status is the text of the argument expression as it appears in the source file,
filename is the name of the source file containing the
call to assert(), and nnn is the line number in
that file at which the call appears. assert() then causes
the program to terminate.

assert() is potentially a very useful way of dealing with
error or exception conditions detected within your programs, where
the condition is such that you are unable (or unwilling) to try
to make the program deal with it in any more intelligent way.
The beauty of assert() is that, if it is activated, the
resulting message immediately pinpoints exactly the position in your
source code where the problem is detected. However, note that
this benefit will be almost totally lost if you place the
invocation of assert() within a function of your own, which
you then call from wherever the exception is detected: for then the
file name and line number output by assert() will always be
the same (namely within your exception handler) and will not give any
indication of the real position at which the problem was
detected.
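
A sketch of assert() placed directly at the point where a problem would be detected is given below. The function average and the conditions it checks are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the average of the first n elements of values.
 * The caller must supply a valid array and n > 0; assert() stops
 * the program, naming this file and line, if that is not the case. */
double average(const int *values, int n)
{
    long total = 0;
    int i;

    assert(values != NULL);
    assert(n > 0);
    for (i = 0; i < n; i++) {
        total += values[i];
    }
    return (double)total / n;
}
```

Because each assert() sits inside average() itself, a failure message points at the exact check that failed rather than at some shared error-handling function.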

Character Class Tests: ctype.h

ctype.h provides prototypes for a selection of functions
which test a char value for membership of a given class. In
each case the function takes a single argument, which is technically
of type int, but which represents a char value or,
possibly, EOF; this is for compatibility with fgetc()
as discussed earlier. The return value is always of type int, being
zero (false) if the value is not a member of the relevant class,
or non-zero (true) if it is.

Some examples are as follows:

int isalpha(int c) : Letter (i.e. alphabetic)?
int isdigit(int c) : Digit?
int isalnum(int c) : Letter or digit
(alphanumeric)?
int islower(int c) : Lower case letter?
int isupper(int c) : Upper case letter?
int isspace(int c) : Whitespace (i.e. space,
newline, tab, etc.)?
int isprint(int c) : Printable or displayable
(including space)?


ctype.h also provides two functions which convert the case
of letters:

int tolower(int c)
int toupper(int c)
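
The sketch below combines a class test with a case conversion. The function name upcase_and_count_digits is an assumption made for this illustration.

```c
#include <ctype.h>

/* Converts the string s to upper case in place and returns the
 * number of decimal digits it contains. */
int upcase_and_count_digits(char *s)
{
    int digits = 0;

    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: the ctype.h functions expect a
         * value representable as unsigned char (or EOF). */
        unsigned char c = (unsigned char)*s;

        if (isdigit(c)) {
            digits++;
        }
        *s = (char)toupper(c);
    }
    return digits;
}
```

Characters which are not lower case letters are returned unchanged by toupper(), so digits, spaces and punctuation pass through untouched.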


String Functions: string.h

string.h provides prototypes for a wide range of functions
which operate upon strings in various ways. A brief description
of a small sample of these follows. Note that where one of these
functions potentially modifies a string, it might, in
general, make it longer; therefore in passing in a
pointer to the string which will be modified, you must make sure
that the char array holding this string is big enough to hold
the longest possible string (including the nul terminator) which
might conceivably result from the string operation(s). If you
fail to watch out for this the results will generally be
unpredictable, and definitely not very pleasant.

char *strcpy(char *s, char *t)

Copy the string pointed to by t to that pointed to by
s. The return value is simply s (i.e. a pointer to
the destination string); this is sometimes
convenient in forming more complex expressions, but, more
usually, it can be ignored (discarded).

char *strcat(char *s, char *t)

Concatenate the strings pointed at by s and t; the
result is pointed to by s.  That is, in effect, the
characters from string t are added on at the end of string
s. The return value is again simply s.

int strcmp(char *s, char *t)

Compare the strings pointed to by s and t.  The
return value is zero if the two strings are identical; it
is negative if s would come before t in an
alphabetical ordering; and positive if t would come before
s.
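
The three functions might be used together as in the following sketch. The function name join_with_space is an assumption made for this illustration; note how the comment spells out the size the destination array must have.

```c
#include <string.h>

/* Builds "first second" in dest, which must have room for
 * strlen(first) + 1 + strlen(second) + 1 characters.
 * Returns dest, as strcpy() and strcat() do. */
char *join_with_space(char *dest, const char *first, const char *second)
{
    strcpy(dest, first);    /* dest now holds a copy of first */
    strcat(dest, " ");      /* append the separator */
    strcat(dest, second);   /* append second, plus the terminator */
    return dest;
}
```

The result can then be compared with another string using strcmp(), remembering that a return value of zero means the two strings are identical.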

Mathematical Functions: math.h

double sin(double x)

The return value is the sine of x,
where x is an angle expressed in radians.

double cos(double x)

The return value is the cosine of x,
where x is an angle expressed in radians.

double tan(double x)

The return value is the tangent of x,
where x is an angle expressed in radians.

double asin(double x)

The return value is the inverse sine
or arcsine of x.
x must be in the interval $[-1, 1]$.
The return value will be expressed in radians
and lie in the interval $[-\pi/2, \pi/2]$.

double acos(double x)

The return value is the inverse cosine
or arccosine of x.
x must be in the interval $[-1, 1]$.
The return value will be expressed in radians
and lie in the interval $[0, \pi]$.

double atan(double x)

The return value is the inverse tangent
or arctangent of x.
The return value will be expressed in radians
and lie in the interval $[-\pi/2, \pi/2]$.

double atan2(double y, double x)

The return value is the inverse tangent
or arctangent of y/x.
The return value will be expressed in radians
and lie in the interval $[-\pi, \pi]$.

The advantage of atan2() over atan()
is that, because it is given separate access to the two
original (rectangular) co-ordinates, it is able
to distinguish between angles in the left and right
half planes (which is not possible if only
y/x is passed in, as is the case with atan()).

double exp(double x)

The return value is the exponential
function of x, i.e. $e^x$.

double log(double x)

The return value is the natural
logarithm
of x, i.e. $\ln(x)$. x must be greater
than zero.

double log10(double x)

The return value is the base 10
logarithm
of x, i.e. $\log_{10}(x)$. x must be
greater than zero.

double pow(double x, double y)

The return value is x
raised to the power y, i.e. $x^y$.
It is an error to invoke this function
with x equal to zero and y
less than or equal to zero; or with x
less than zero and y not an
integer. (y is always of
type double, but it can have
an integer value, e.g. 4.0.)

double sqrt(double x)

The return value is the square root
of x, i.e. $\sqrt{x}$. x must be
greater than or equal to zero.

Conclusion

This essay serves only to skim the surface of the facilities
offered by the C Standard Library. It is well worth your while
to become reasonably familiar with the full range of services
which are available in the library: it can speed up programming
of many applications very considerably.

Copyright

This Hypermedia Document is copyrighted, © 1994-96, by
Barry McMullin
http://www.eeng.dcu.ie/7Emcmullin.

Permission is hereby granted to access, copy, or store this work, in
whole or in part, for purposes of individual private study only. The
work may not be accessed or copied, in whole or in part, for
commercial purposes, except with the prior written permission of the
author.