The Standard Library

Last Modified: 29th March 1996

Barry McMullin

Note that this complete document is available in LaTeX (stdlib.tex) and plain ASCII (stdlib.txt) forms, to allow downloading and offline browsing.

Introduction

This document is one component of the hypermedia documentation for the course Software Engineering 1 (http://www.eeng.dcu.ie/7Emcmullin/swe1/swe1root/swe1root.html). It presents more detailed notes on the specific topic of the Standard Library.

On Re-inventing Wheels

It is well known that one should not waste time re-inventing the wheel. In Engineering, this means not redesigning something for which one already has a perfectly satisfactory design. In Software Engineering, it means not rewriting software to perform operations for which one has previously written (and tested!) software.

In the case of the C language there is a range of things that programmers very frequently wish to do, which are so common that standard software for the purpose is actually distributed along with every compiler. This software is called the Standard Library. You are more or less guaranteed that every C compiler will come with an implementation of the Standard Library.

It is important to be aware at least of the existence of the Standard Library, and to have some outline idea of the software contained in the Library. In this way you can avoid unnecessarily writing and testing software which is already effectively available, in a well tested form, for free.

The recommended compiler for the current course is DJGPP (http://www.delorie.com/djgpp), which is a port (version) of the GNU C Compiler for the PC platform. Detailed information on the Standard Library (Version 1.12) provided with this compiler is available in the Web-based documentation (http://www.delorie.com/djgpp/doc/libc-1.12/libctoc.html).
You should consult this whenever you need details of the full range of functionality of the Standard Library. On the other hand, this essay is intended only to provide an introduction to a very small subset of the Standard Library, and is not a substitute for the full DJGPP documentation.

Mechanics

In order to use the facilities of the Standard Library, one must have some understanding of the mechanics of how the compiler will access it.

Some very simple programs can be completely self-contained: all the code for all the functions is contained in a single program file. In general, however, this need not be the case; in particular, it will not be the case if you wish to make use of the Standard Library.

In principle, of course, the Standard Library could be distributed simply as a set of one or more source code files. You would then use a library function simply by copying the source code out of the relevant library file and pasting it into your own program file. However, this is not the mechanism which is used in practice, for a variety of reasons:

- Since the code for the Standard Library is rarely if ever modified, it is very inefficient to have to recompile or retranslate it every time a programmer makes a change to their own code.
- The library functions are, in some cases, not actually written in the C language at all.
- The compiler suppliers often prefer not to distribute source code for the Standard Library, as this would facilitate unscrupulous pirating of their work (although this does not apply to DJGPP).

What is actually done is that the Standard Library software is distributed in a sort of canned or precompiled form, as a set of object files (or, more commonly, a single object library file). Your program file(s) are then compiled completely separately from the source code for the Standard Library, yielding one or more further object files.
Finally, all the required object files, both your own and those making up the Standard Library, are combined or linked together to yield the executable file, which can actually be run.

To a large extent this process can be made automatic and invisible. The compiler compiles your source file(s) to the object code form, and then automatically links these together with any required library object file(s). However, it cannot be made completely automatic. In practice there is certain information, associated with the Standard Library, which the compiler must be made aware of at the time it is compiling your source files, in order to make sure that the object files produced will correctly link up with the object files for the Standard Library.

Since, for the most part, the Standard Library simply consists of a set of functions which you can call in your program, the most important such information is the way data is to be exchanged with each such function; i.e. how many parameters does the function take, of what types, in what order, and what type of return value does the function yield (if any)? The compiler could, of course, try to infer this information from the way the function is actually called; but this is not always technically possible, and, in any case, it is much better if the compiler has some independent way of knowing how the function should be called, because then it can automatically check whether you have, in each case, called it correctly. It turns out that this is extremely useful, and is an effective way of automatically detecting a range of very common programming mistakes.

The compiler is given this information about how a function should be called by the use of a so-called function prototype. This is simply the header of the function definition, which states the function name, the return type, and the formal parameter list.
But whereas, in a function definition, this header is then followed by a compound statement which actually defines the function, in a function prototype it is simply followed by a terminating semicolon. Thus, for example, a prototype for a function called multiply might tell the compiler that the function should take three parameters, respectively of types (unsigned *), unsigned and (unsigned *), and what type of return value it will yield. Function prototypes should appear at the outermost level of a source file, i.e. not within the definition of any function.

So far, so good. It seems that if you wish to call or invoke any of the Standard Library functions in your program, you must simply insert, somewhere before the function(s) which make such call(s), a suitable function prototype, so that the compiler will then be able to decide whether the calls are correct or not.

But how are you going to know what is the correct prototype for each function in the first place? Well, one possibility is to look up the function in the detailed technical documentation for the Standard Library: this will normally include the function prototype. You can then copy that into your own file. However, this is clearly unsatisfactory. Apart from being laborious, it is error prone, and the possibility of an error in the function prototype undermines a primary point of using prototypes in the first place, namely that it allows the compiler to crosscheck for valid function calls!

A better idea is for the prototypes to be provided to you in a machine readable form, i.e. in one or more files on the computer. Then you can simply copy the required ones, and paste them into your own source files. Well, yes, this is a major improvement, but it still leaves something to be desired. For one thing, duplicating the prototypes in every source program wastes disk space.
More seriously, there is always a danger that you might (accidentally) modify a prototype, again confounding the idea of allowing the compiler to automatically crosscheck the prototype against the invocation(s) of the function.

A final possibility is to leave the prototypes in one or more separate files, but have the compiler automatically access or scan the relevant files immediately before, or as part of the process of, compiling your files. In this way, there is no laborious manual copying of the prototypes, and also no duplication and no risk of accidental modification.

Files which are used for this kind of purpose (which contain no actual executable code, but only function prototypes, and possibly other things such as symbolic constants etc., which allow the compiler to correctly mesh the file it is compiling with some other software which has been pre-compiled) are called header files. Header files are normally given the extension .h to distinguish them from files which actually contain executable code (function definitions etc.), which will normally have the extension .c.

It would be possible, in principle, to put all the prototypes for all the functions in the Standard Library, plus any other required information (symbolic constants etc.), in a single .h file, and have the compiler automatically include it in compiling any file. However, in practice this is not done, for various reasons. For example, most .c files only involve calling a small subset of the functions in the Standard Library; it is then wasteful and time consuming for the compiler to process prototypes for all functions in the Standard Library.

The mechanism that is actually used, then, is as follows. A series of separate .h files is supplied with the compiler. Each one provides prototypes (plus other required information) relating to only some coherent or related subset of the functions in the Standard Library.
The programmer must then explicitly instruct the compiler to process just those .h files which are required in order to properly compile any particular .c file. This is done by inserting, into the .c file, one or more so-called #include directives.

A #include directive is much the same as a #define, in the sense that it is not handled by the compiler proper but by the pre-processor, which runs (automatically) immediately before the compiler. In this case, the pre-processor processes a #include by concatenating together the .h file and the original .c file, to produce a big temporary file, which is what is actually then processed in the compilation phase proper.

Thus, if you want to use a function from the Standard Library, you must first look up, in the relevant technical documentation, which header file contains the prototype etc. for that function, and then insert a line such as #include <stdio.h> in your .c file, naming the required header file.

The angle brackets around the name of the header file tell the pre-processor to search for the file in the standard directories for such things; normally these will be set up when the compiler is installed, and you, as a programmer, need not worry about what these standard directories actually are.

You may, of course, need to include several header files, depending on the particular selection of Standard Library functions you wish to use. All required #include directives are normally placed close to the top of your .c file, typically either at the very top, or immediately after an initial introductory comment which documents the overall contents of the source file. Each #include must, in any case, precede any calls or invocations of the relevant functions.

The rest of this essay is concerned with introducing just a very small selection of the several hundred functions normally available in the Standard Library. It is organised into sections according to the distinct .h files required.
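The mechanics described above can be sketched in a few lines of C. This is only an illustration, not part of the original notes: the functions scale_sum() and demo_total() are hypothetical names invented for the purpose. The sketch shows two #include directives at the top of the file, a prototype placed before the first call, and the matching definition placed afterwards.

```c
#include <assert.h>  /* supplies the assert() macro */
#include <stdio.h>   /* supplies the prototype for printf(), and NULL */

/* Prototype: the header of the definition, followed by a semicolon.
   scale_sum() is a hypothetical function used only for illustration. */
unsigned scale_sum(unsigned *a, unsigned n, unsigned factor);

/* A caller that appears BEFORE the definition of scale_sum().
   The compiler checks the call below against the prototype above. */
unsigned demo_total(void)
{
    unsigned data[3] = {1, 2, 3};
    unsigned total = scale_sum(data, 3, 2);

    printf("%u\n", total);
    return total;
}

/* Definition: the same header, followed by a compound statement. */
unsigned scale_sum(unsigned *a, unsigned n, unsigned factor)
{
    unsigned i, total = 0;

    assert(a != NULL);
    for (i = 0; i < n; i++)
        total += a[i] * factor;   /* multiply each element, then accumulate */
    return total;
}
```

If the call in demo_total() passed the wrong number or types of arguments, the compiler could report the mismatch, because it has already seen the prototype. The prototypes for printf() and the other library functions reach the compiler in the same way, via the included header files.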
Implementation-defined Limits: limits.h

The header file limits.h essentially just provides #define directives defining symbolic constants for the maximum and minimum values allowed with the standard integral data types, i.e. it does not provide any function prototypes as such. It is important to be able to refer to these values in your programs in order to prevent, or at least detect, overflow situations. The values potentially differ from one compiler to another (hence implementation-defined), but the symbolic constants defined in limits.h always have the same names. Thus by using the symbolic constants to refer to these limits it should be possible to write your programs in such a way that they will automatically adjust to whatever the limits actually are with any particular compiler.

Some constants which are commonly used are:

INT_MAX : maximum value of int
INT_MIN : minimum value of int
LONG_MAX : maximum value of long
LONG_MIN : minimum value of long
UINT_MAX : maximum value of unsigned int
ULONG_MAX : maximum value of unsigned long

This is not an exhaustive list: examine a copy of limits.h for yourself to see others. However, note that there is, of course, no need for constants called, say, UINT_MIN or ULONG_MIN, since these minima are always guaranteed to be simply zero.

Error Conditions: errno.h

Many functions in the standard library will detect exception or error conditions in certain circumstances, essentially if the function has been requested to do something which, for some reason, it can't. The exact action of the function in such situations depends on the details of the particular function and the particular exception condition. However, the most usual strategy is for the return value from the function to signal, with some special value, that something has gone wrong. It is then up to the calling site to react to this in some appropriate way. Minimally this will mean giving some kind of overt or visible external signal of the problem.
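The return-value convention just described can be sketched with a small function of our own, which also makes use of the limits.h constants. This is only an illustration under stated assumptions: safe_add() is a hypothetical name, and the choice of 1 for success and 0 for failure is simply one possible convention.

```c
#include <assert.h>  /* supplies the assert() macro */
#include <limits.h>  /* supplies INT_MAX and INT_MIN */
#include <stddef.h>  /* supplies NULL */

/* Hypothetical helper: add a and b without ever overflowing.
   Returns 1 and stores the sum in *result on success;
   returns 0 (the "something has gone wrong" value) if the sum
   would be greater than INT_MAX or less than INT_MIN. */
int safe_add(int a, int b, int *result)
{
    assert(result != NULL);

    /* Test against the limits BEFORE adding, so the addition itself
       can never overflow. */
    if (b > 0 && a > INT_MAX - b)
        return 0;
    if (b < 0 && a < INT_MIN - b)
        return 0;

    *result = a + b;
    return 1;
}
```

A calling site would then test the return value, for example with if (!safe_add(x, y, &sum)), and react by reporting the problem rather than carrying on with a meaningless result.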
In any case, while the standard library functions typically provide an initial or gross indication that something has gone wrong via the return value, it is often also useful for the calling site to have access to more detailed information which clarifies exactly the nature of the problem. This is usually achieved by having the standard library function record a detailed error number in a global variable called errno. This variable is defined within the standard library itself, but your programs can gain access to it by a so-called extern declaration. This declaration is already provided in the header file errno.h; so if you include this, you will then be able to access errno just as any other variable is accessed. By examining its value immediately after a call to a standard function, your program can generally establish what (if anything) went wrong.

As well as the declaration of errno, the file errno.h also provides a series of #define directives which define symbolic names for the standard error numbers or codes which may be recorded in errno. This allows your program to test for specific error codes by comparing errno to these symbolic values.

We shall also see later how errno can be automatically translated into a corresponding textual error message and, say, displayed on the computer screen (see the discussion of the function perror(), prototyped in stdio.h).

Utility Functions: stdlib.h

The header file stdlib.h provides prototypes for a selection of miscellaneous utility functions, as well as a few more symbolic constants. A small selection of the more commonly used functions is as follows:

int atoi(char *s)

This takes a string representation of an integer (i.e. a string something like "145" or "-9999" or "000000" etc.) and converts it into the corresponding internal representation of type int, which is the return value from the function. The conversion will ignore any trailing data in the string (e.g. "143yu*" gets converted as 143).
The behaviour where the answer would overflow (i.e. be greater than INT_MAX or less than INT_MIN) is generally unpredictable or indeterminate, so it is your responsibility to make sure this never arises; if the conversion simply cannot be done (e.g. atoi("xyz")) the return value will be 0.

long atol(char *s)

This takes a string representation of an integer (i.e. a string something like "145" or "-999999" or "25763178" etc.) and converts it into the corresponding internal representation of type long, which is the return value from the function. Leading white space in the string is ignored. Again, the conversion will ignore any trailing data in the string, and the behaviour where the answer would overflow (i.e. be greater than LONG_MAX or less than LONG_MIN) is indeterminate; and if the conversion simply cannot be done (e.g. atol("")) the return value will be 0L.

void exit(int status)

This function may be called to forcibly terminate the program at any point. The parameter status is simply a number which is, in some sense, made available to the external environment of the program. (In the case of programs run under DOS, the exit() status can be accessed via the errorlevel parameter in the DOS if command, though this is normally only used in batch files.) In any case, if you are using exit() to terminate your program, you should normally just give it one of two pre-defined exit values which have been given symbolic names in stdlib.h: use exit(EXIT_SUCCESS) if the program is terminating normally and exit(EXIT_FAILURE) if it is terminating because of some unexpected or intolerable exception or error being encountered.

Input and Output: stdio.h

stdio.h (standard input/output) provides basic functions for accessing data external to a program. This naturally includes data in files on disk, but also covers data coming from the keyboard, or displayed on the screen, or data routed via any of the other input/output ports of the computer.
All of these sources or destinations for data may be generically referred to as streams, or, more simply, files. Files are classified into two kinds: text and binary. A text file is divided into lines, where each line has zero or more characters, and is terminated by a newline character '\n'. [...]

FILE *fopen(char *filename, char *mode)

[...] DOS uses the backslash character as its directory separator; this is a rather unfortunate historical accident, as this character also has a special meaning in C strings, as an escape character. The net effect is that, in representing a DOS file name in a string constant, you must write the directory separator as a double backslash. [...]

mode is a string specifying the desired file access mode, i.e. what kind(s) of operations are going to be carried out on it. Two examples of legal values for this would be:

"r" : Open the file for reading.
"w" : Open the file for writing.

In general, fopen() will open a file for access either as a binary or a text file, but the default is implementation specific. In any case, regardless of the default, the type of access can be explicitly specified in the mode string by adding either a t or b character for text or binary respectively, e.g.:

"rt" : Open for reading as a text file.
"wb" : Open for writing as a binary file.

If the call to fopen() is successful, then the return value is simply the value of the created file pointer. However, if the call fails for any reason (e.g. trying to open a file on device "X:" when no such device is present on the machine) then the return value will be the special zero or null pointer value defined in C. stdio.h (and, indeed, several of the other header files) defines the symbolic name NULL for this value. Thus, after a call to fopen() your program should always check the return value to see whether it is NULL, and take some appropriate action if so (e.g. issue a suitable message and call exit()). In any case, if the return value from fopen() is NULL then errno will also be set to some more specific error code.
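The advice to check the return value of fopen() can be sketched as follows. This is a minimal illustration rather than a definitive implementation: can_open() is a hypothetical helper invented here, and it reports failure through its own return value instead of calling exit(), so that the calling site can decide what to do.

```c
#include <assert.h>  /* supplies the assert() macro */
#include <errno.h>   /* supplies the declaration of errno */
#include <stdio.h>   /* supplies FILE, NULL, fopen(), fclose(), fprintf() */

/* Hypothetical helper: returns 1 if filename can be opened for
   reading, or 0 if fopen() fails. On failure, a message including
   the current value of errno is written to stderr. */
int can_open(char *filename)
{
    FILE *fp;

    assert(filename != NULL);

    fp = fopen(filename, "r");
    if (fp == NULL) {
        /* fopen() signalled failure through its return value;
           errno holds the more specific error code. */
        fprintf(stderr, "cannot open %s (errno = %d)\n", filename, errno);
        return 0;
    }

    fclose(fp);   /* finished with the file, so close it again */
    return 1;
}
```

In a complete program the failure branch would more typically issue its message and then call exit(EXIT_FAILURE), which requires stdlib.h to be included as well.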
int fgetc(FILE *stream)

fgetc() reads the next sequential character (byte) from the open file identified by the file pointer stream. This character is converted into an int value, and is the return value from the function. However, if the end of file is encountered (the last character has already been read), or if any other error is encountered in reading, then the return value will be EOF, and errno will be set appropriately. In the case of a text file, the newline character '\n' [...]

[...] '\n' to the relevant stream. Try to predict the precise effect of this sequence of statements for example: [...]

Note that because the % character in the format string is interpreted in a special way by fprintf() (namely as introducing a conversion specification), there is a problem if one actually wants to just output this character. To get around this you just put in two successive % characters, thus: %%. The second one will cause fprintf() to recognise that this is not a conversion specification after all, and it will simply output one % character without further ado (and without attempting to convert any argument).

The return value from fprintf() is the number of characters which have been written out, or EOF if any error has been detected. C programmers commonly neglect to check this return value from fprintf(), presumably because the error conditions which it can detect and signal are rather rare. Nonetheless, as a general practice I would recommend that you include code in your programs to check this return value (at least to see if it is EOF), unless your usage is extremely trivial (e.g. just printing the format string, without having any extra arguments to be converted).

int printf(char *format, ...)

printf(format, ...) is equivalent to fprintf(stdout, format, ...).

int fscanf(FILE *stream, char *format, ...)
fscanf() is more or less the converse of fprintf(): it reads characters from stream and attempts to convert them into internal representations of appropriate types, as determined by conversion specifications in the string format. The extra arguments must all be pointer types in this case. Typically, these arguments will simply be the addresses of variables (generated with the & operator) into which the converted values are to be stored. fscanf() thus simply stores each converted value at the location pointed at by each extra argument in turn. Again, it is up to you, the programmer, to ensure that the arguments match the conversion specifications correctly; and again, if they do not, then this will not be automatically detected, and will have entirely unpredictable effects.

The conversion specifications handled by fscanf() are quite similar in format to those handled by fprintf(). They will not be described further here.

The return value from fscanf() is the number of input fields or values successfully scanned, converted, and stored, or EOF if end of file or any error is encountered. In general, it is a very good idea for your programs to check this return value to see that it has the expected value.

int scanf(char *format, ...)

scanf(format, ...) is equivalent to fscanf(stdin, format, ...).

int fclose(FILE *stream)

fclose() simply closes the open file associated with stream. The return value is zero if this is completed successfully; otherwise it is EOF and errno is set. While open files should be closed automatically if or when your program exits anyway, it is still good practice to explicitly close files with fclose() when your program is finished with them. This makes the behaviour of the program more intelligible to someone reading it, and also provides a little extra robustness against the possibility that your program might terminate abnormally (i.e. CRASH!).
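The recommendations above (check the value returned by fscanf(), and close the file explicitly with fclose()) can be sketched together. This is an illustrative sketch only: sum_integers() is a hypothetical helper, and returning -1 to signal failure is an assumption made for the example.

```c
#include <assert.h>  /* supplies the assert() macro */
#include <stdio.h>   /* supplies FILE, NULL, EOF, fopen(), fscanf(), fclose() */

/* Hypothetical helper: reads whitespace-separated decimal integers
   from the named text file and adds them up. Stores the sum in
   *total and returns the number of values read, or returns -1 if
   the file cannot be opened or closed. */
int sum_integers(char *filename, long *total)
{
    FILE *fp;
    int value;
    int count = 0;

    assert(filename != NULL);
    assert(total != NULL);

    fp = fopen(filename, "r");
    if (fp == NULL)
        return -1;

    *total = 0;
    /* fscanf() returns the number of values converted and stored.
       One conversion is requested, so anything other than 1 means
       end of file, an error, or input that is not an integer. */
    while (fscanf(fp, "%d", &value) == 1) {
        *total += value;
        count++;
    }

    if (fclose(fp) == EOF)
        return -1;
    return count;
}
```

Note that the address of value is passed to fscanf() with the & operator, and that the loop stops at the first item which cannot be converted instead of assuming that every read succeeds.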
void perror(char *s)

perror(s) outputs the string s on stderr, followed by an error message which describes or elaborates the error code currently stored in errno. It is equivalent to:

fprintf(stderr, "%s: %s\n", s, "error message")

where error message is a string describing whatever error condition is represented by the current value of errno. perror() might usefully be called on any occasion when your program detects that a standard library function has failed to operate as expected.

Diagnostics: assert.h

assert.h provides the declaration of one function-like object with the following prototype:

void assert(int status)

For technical reasons, this object is not implemented as a function in the normal sense, but is what is called a macro instead. However, for most practical purposes, it can be regarded just as any normal function. If status is non-zero then assert() will have no effect; but if status is zero then a message of the following form will be output to stderr:

Assertion failed: expression, file filename, line nnn

where expression is the source text of the argument given for status (which has evaluated to zero), filename is the name of the source file containing the call to assert(), and nnn is the line number in that file at which the call appears. assert() then causes the program to terminate.

assert() is potentially a very useful way of dealing with error or exception conditions detected within your programs, where the condition is such that you are unable (or unwilling) to try to make the program deal with it in any more intelligent way. The beauty of assert() is that, if it is activated, the resulting message immediately pinpoints exactly the position in your source code where the problem is detected.
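As a minimal sketch of this usage, the function below guards its own assumptions with assert() at the point where a violation would be detected. The function average() is a hypothetical example, not something provided by the Standard Library.

```c
#include <assert.h>  /* supplies the assert() macro */
#include <stddef.h>  /* supplies NULL */

/* Hypothetical helper: integer average of the first n elements of
   values. The caller must supply a valid array and n greater than 0. */
int average(int *values, int n)
{
    long sum = 0;
    int i;

    /* If either condition is false, the program terminates with a
       message naming this source file and the line of the failing
       assert() call. */
    assert(values != NULL);
    assert(n > 0);

    for (i = 0; i < n; i++)
        sum += values[i];
    return (int)(sum / n);   /* integer division truncates */
}
```

A call such as average(data, 0) would therefore stop the program at the second assert(), and the message would identify that line directly.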
However, note that this benefit will be almost totally lost if you place the invocation of assert() within a function of your own, which you then call from wherever the exception is detected: for then the file name and line number output by assert() will always be the same (namely within your exception handler) and will not give any indication of the real position at which the problem was detected.

Character Class Tests: ctype.h

ctype.h provides prototypes for a selection of functions which test a char value for membership of a given class. In each case the function takes a single argument, which is technically of type int, but which represents a char value or, possibly, EOF; this is for compatibility with fgetc() as discussed earlier. The return value is always of type int, being zero (false) if the value is not a member of the relevant class, or non-zero (true) if it is. Some examples are as follows:

int isalpha(int c) : Letter (i.e. alphabetic)?
int isdigit(int c) : Digit?
int isalnum(int c) : Letter or digit (alphanumeric)?
int islower(int c) : Lower case letter?
int isupper(int c) : Upper case letter?
int isspace(int c) : Whitespace (i.e. space, newline, tab, etc.)?
int isprint(int c) : Printable or displayable (including space)?

ctype.h also provides two functions which convert the case of letters:

int tolower(int c)
int toupper(int c)

String Functions: string.h

string.h provides prototypes for a wide range of functions which operate upon strings in various ways. A brief description of a small sample of these follows.

Note that where one of these functions potentially modifies a string, it might, in general, make it longer; therefore in passing in a pointer to the string which will be modified, you must make sure that the array holding this string is big enough to hold the longest possible string (including the nul terminator) which might conceivably result from the string operation(s).
If you fail to watch out for this the results will generally be unpredictable, and definitely not very pleasant.

char *strcpy(char *s, char *t)

Copy the string pointed to by t to that pointed to by s. The return value is simply s (i.e. a pointer to the destination string); this is sometimes convenient in forming more complex expressions, but, more usually, it can be ignored (discarded).

char *strcat(char *s, char *t)

Concatenate the strings pointed at by s and t; the result is pointed to by s. That is, in effect, the characters from string t are added on at the end of string s. The return value is again simply s.

int strcmp(char *s, char *t)

Compare the strings pointed to by s and t. The return value is zero if the two strings are identical; it is negative if s would come before t in an alphabetical ordering; and positive if t would come before s.

Mathematical Functions: math.h

double sin(double x)

The return value is the sine of x, where x is an angle expressed in radians.

double cos(double x)

The return value is the cosine of x, where x is an angle expressed in radians.

double tan(double x)

The return value is the tangent of x, where x is an angle expressed in radians.

double asin(double x)

The return value is the inverse sine or arcsine of x. x must be in the interval $[-1, 1]$. The return value will be expressed in radians and lie in the interval $[-\pi/2, \pi/2]$.

double acos(double x)

The return value is the inverse cosine or arccosine of x. x must be in the interval $[-1, 1]$. The return value will be expressed in radians and lie in the interval $[0, \pi]$.

double atan(double x)

The return value is the inverse tangent or arctangent of x. The return value will be expressed in radians and lie in the interval $[-\pi/2, \pi/2]$.

double atan2(double y, double x)

The return value is the inverse tangent or arctangent of y/x. The return value will be expressed in radians and lie in the interval $[-\pi, \pi]$.
The advantage of atan2() over atan() is that, because it is given separate access to the two original (rectangular) co-ordinates, it is able to distinguish between angles in the left and right half planes (which is not possible if only y/x is passed in, as is the case with atan()).

double exp(double x)

The return value is the exponential function of x, i.e. $e^x$.

double log(double x)

The return value is the natural logarithm of x, i.e. $\ln x$. x must be greater than zero.

double log10(double x)

The return value is the base 10 logarithm of x, i.e. $\log_{10} x$. x must be greater than zero.

double pow(double x, double y)

The return value is x raised to the power y, i.e. $x^y$. It is an error to invoke this function with x equal to zero and y less than or equal to zero; or with x less than zero and y not an integer. (y is always of type double, but it can have an integer value, e.g. 4.0.)

double sqrt(double x)

The return value is the square root of x, i.e. $\sqrt{x}$. x must be greater than or equal to zero.

Conclusion

This essay serves only to skim the surface of the facilities offered by the Standard Library. It is well worth your while to become reasonably familiar with the full range of services which are available in the library: it can speed up programming of many applications very considerably.

Copyright

This Hypermedia Document is copyrighted, 1994-96, by Barry McMullin (http://www.eeng.dcu.ie/7Emcmullin). Permission is hereby granted to access, copy, or store this work, in whole or in part, for purposes of individual private study only. The work may not be accessed or copied, in whole or in part, for commercial purposes, except with the prior written permission of the author.