Title: Introduction to C Lecture 2: Elements and Syntax of the Language
1. Introduction to C, Lecture 2: Elements and Syntax of the Language
2. A Picture of the Computer's Memory
- The computer stores all information in its memory as a sequence of binary numbers: a series of zeroes and ones (actually the on and off states of transistor switches). Each number is split into bite-sized chunks, or bytes, that the computer can digest. Each zero or one is a bit (binary digit), and each byte is eight bits (hence can hold a number from 0 to 255). The decimal number 35, which is 100011 in binary, is stored in one byte as 00100011.
- You can picture the memory as an enormous string of numbers that you can read and change. Each number has an address. Each memory address stores a value, a sequence of zeroes and ones, which can be interpreted in different ways. If it is a floating-point number, part of it will be interpreted as the exponent; but if you look at that same sequence of zeroes and ones and interpret it as an integer, it will be treated as a single whole number with no decimal places and no exponent. The interpretation is different again if the sequence represents a string of characters.
- Later in this lecture we'll look closely at the computer's interpretation of the bits stored in its memory. Let's start by looking at the different elements of the C language.
[Figure: memory cells, each labelled with its Address and its Value]
3. Lexical Elements of C
- The basic vocabulary consists of tokens, of which there are six kinds:
  - keywords (reserved words you can't use for anything else)
  - identifiers (e.g. variable names, function names like main, cos ...)
  - constants (e.g. the number 5)
  - string constants (e.g. "Hello\n")
  - operators (e.g. +, -, *, and the parentheses following function names)
  - punctuators (e.g. ; and the braces { })
- The compiler either ignores white space (spaces, tabs, linefeeds etc.) or uses it to separate tokens. NB: if you write your code in a Windows editor, you may end up with ctrl-M (carriage return) characters at the end of each line, causing your Unix C compiler to complain!
4. Comments
- Comments are strings of symbols placed between the delimiters /* and */. The compiler changes each comment into a single blank character. Note that some compilers accept the C++ convention in which comments may begin with // and run to the end of the line. This only became standard in C99, so do not use it if you want your code to be portable to ANSI C (C89) compilers!
- Example:
    /*--------------------------------------------------------*/
5. Keywords
- Keywords are reserved words with strict meanings that may not be redefined or used in other contexts. They are:
    auto      break     case      char
    const     continue  default   do
    double    else      enum      extern
    float     for       goto      if
    int       long      register  return
    short     signed    sizeof    static
    struct    switch    typedef   union
    unsigned  void      volatile  while
- Don't worry, I'm not asking you to remember them all! Just be aware that they are there, and don't use them for anything else, e.g. variable names.
6. Identifiers
- An identifier (a variable name or function name) is a sequence of letters, digits and underscores; no other characters are allowed. Rules:
  - The first character must not be a digit.
  - You are also strongly advised not to begin identifiers with an underscore, as it may cause conflict with some system names.
  - Lower- and uppercase characters are distinct.
- Give variables names that make sense! Call them things like "relative_angle" or "spin_z", not just "q" or "spz", which won't make immediate sense to you later on.
- Exercise: which of the following are identifiers?
  - k
  - _id
  - not#me
  - 101_south
  - i_am_an_identifier
  - so-am-i
- Names of standard library functions such as printf should not normally be redefined.
- Note that on some old systems only the first 8 characters of an identifier are used; in ANSI C, at least the first 31 are used.
7. Constants
- As well as variables x that change as the program runs, you will often need to use constants, e.g. e = 1.6022e-19, that do not change. As well as the usual integer or floating-point constants (0, 23, 34.682, ...), constants may be characters such as letters, etc. (written in single quotes, e.g. 'a'), or the so-called escape characters that are written with a backslash, such as '\n' (newline).
- Incidentally, C provides octal and hexadecimal as well as decimal integers: 17 is decimal, 017 is octal (decimal 15) and 0x17 is hexadecimal (decimal 23). A negative number such as -17 is not itself a constant: it is a constant expression, the unary minus operator applied to the constant 17.
8. String constants
- A sequence of characters enclosed by double quotes, e.g. "abc", is a string constant or string literal. It is stored by the compiler as an array of characters.
- String constants are not the same as character constants. Thus, 'a' is different from "a": the latter is an array of characters (admittedly a very short one, holding just 'a' and the terminating null character '\0'), whereas the former is just a character.
- If a double quote is to appear in a string, it must be preceded by a backslash; likewise the backslash character itself.
- String constants may not include a (literal) linefeed. Thus,
    "this is not
    a string"
  is not a string. If you want to stretch your string over a line break, you can end the line with a backslash; this has the effect of continuing it to the next line. However, it is probably simpler to note that pairs of string constants separated only by white space are concatenated into one string, so
    "this is " "a string"
  is equivalent to
    "this is a string"
9. Operators
- +, -, *, / are for addition, subtraction, multiplication, and division, respectively.
- % is for modulus (clock arithmetic): (5+9)%12 is 2, i.e. divide and take the remainder.
- There are several other operators available. Note that ++ and += are single operators, even though they are written with two characters. Examples of such operators:
  - i++ means add one to i
  - i += 5 means i = i + 5
  - i *= 10 means i = i * 10
10. Increment and Decrement Operators
- The increment and decrement operators are ++ and --, respectively. They can be used as both prefix and postfix operators. Example:
    c = 1;
    a = ++c;
    printf("a = %d, c = %d", a, c);
- This code
  - increments c, so c becomes 2
  - gives a the new value, 2, so it prints out a = 2, c = 2.
- In contrast,
    c = 1;
    a = c++;
    printf("a = %d, c = %d", a, c);
  sets a to be the present value of c, i.e. 1, and then increments c by 1; it prints out a = 1, c = 2.
- Do not confuse these! You have been warned!
11. Precedence and Associativity
- An operator is, obviously, something that performs an operation on one (or more) numbers to produce a result of some sort. Examples are + and *. In a given expression, some operators have priority.
- Rule 1: Multiplication and division are done before addition and subtraction. Thus, 1 + 2 * 3 is 7, not 9.
- There are lots of other operators (things like <, ==, && and so on); the textbooks have a long list of rules about which are done first (see, e.g., Appendix E of K&P). In order to avoid bugs, therefore:
- Rule 2: For anything else, use brackets!
- Example: j *= k + 3 is equivalent to j = j * (k + 3) and not to j = j * k + 3; this is because + has higher precedence than *=. A clear case for using parentheses!
14. Expressions and Statements
- An expression is most easily defined by example. The expression
    k = 3
  is actually a self-contained unit that has the value 3.
- Let me repeat that.
- The whole expression (not just the variable k) has a value (in this case, 3).
- If you put a semicolon at the end of it, it becomes a statement. Note that statements do not have a value.
- Thus, the expression
    k = 3
  does two things:
  - it assigns the value 3 to the variable k
  - it gives the whole expression k = 3 the value 3.
- However, the statement k = 3; (with semicolon) does not have a value.
15. The Value of Expressions, cont'd
- Example:
    b = 2;
    c = 3;
    a = b + c;
  may be condensed to
    a = (b = 2) + (c = 3);
  since the expression (b = 2) not only assigns the value 2 to the variable b, but also has as a whole the value 2; likewise for (c = 3).
- In this way, one can write
    a = b = c = 1;
  as shorthand for
    a = (b = (c = 1));
  in which first c, and then the expression (c = 1), are assigned the value 1; then b, and in turn the expression (b = (c = 1)); and finally the variable a are all assigned the value 1. Note that the associativity is right-to-left: as the operators all have equal precedence, operations are carried out in that order.
18. Assignment Operator (=)
- The assignment operator in C looks like (and is usually referred to by the same name as) the mathematical equals sign, but they are not equivalent.
- The mathematical equation x + 2 = 0 is not a legal assignment expression, since the left-hand side is an expression, not a variable, and it may not be assigned a value in this way.
- In contrast, the statement x = x + 1; is perfectly legal in C: it adds 1 to the current value of x, and assigns the new value to x. Mathematically, though, it makes no sense, as you can see by subtracting x from both sides.
19. Example
- Consider this program, from Kelley & Pohl:
    #include <stdio.h>

    int main(void)
    {
        int i = 0, power = 1;

        while (++i <= 10)
            printf("%-6d", power *= 2);
        printf("\n");
        return 0;
    }
- Exercise: copy it into a file called pwr2.c, try it out, and include comment statements to explain how it works. Note that while regards as true anything with non-zero value.
20. Data Types
- At the start of a function, all variables must be declared, and optionally they may be initialized:
    int a, b, c;
    float x, y = 10.5, z = -6.0;
- This tells the compiler to set aside an appropriate amount of space in memory to store the values associated with each variable. It also enables the compiler to instruct the machine to perform specific operations correctly: at the machine level, addition of integers is a different operation from addition of floating-point numbers.
- The different fundamental data types are described on the following slides.
21. Binary counting
- A decimal number (e.g. 38742) is written as
    $d_n d_{n-1} \ldots d_1 d_0$,
  where the $d_i$ represent individual digits, and it may be evaluated as
    $d_n \times 10^n + d_{n-1} \times 10^{n-1} + \ldots + d_1 \times 10^1 + d_0 \times 10^0$.
- Likewise, in binary, a number $d_n d_{n-1} \ldots d_1 d_0$ may be evaluated as
    $d_n \times 2^n + d_{n-1} \times 2^{n-1} + \ldots + d_1 \times 2^1 + d_0 \times 2^0$,
  but the $d_i$ here are either 0 or 1. Computers store and manipulate their data in this format. Thus, the number
    $38 = 1 \times 2^5 + 0 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 1 \times 2^1 + 0 \times 2^0$
  is represented as 00100110 on an 8-bit machine.
22. Data type int
- Old machines tended to use 16 bits (2 bytes) to store int variables; most machines nowadays use 32 (or occasionally 64) bits. Of these n bits, one is used to store the sign, and the remainder store the value itself. (The data type unsigned assumes the number is non-negative, and uses the extra bit to double its range.)
- It is possible, when the programmer is careless, for numbers to exceed the possible range of values that may be stored in a variable, in which case an integer overflow occurs: the program will usually continue to run, but with results that are nonsense.
- For integer constants, as well as the usual decimal there are also hexadecimal (base 16: digits 0-9, followed by A-F; specified by a leading 0x, as in 0x1a) and octal (base 8: digits 0-7; specified by a leading 0, as in 053).
- Suffixes can be appended to integer constants to specify their type: e.g., 37U specifies an unsigned integer constant.
23. Characters
- Variables of any integral type (in particular, char and int) can be used to represent characters.
- In C there are no constants of type char: character constants such as 'a' and '+' are of type int.
- In addition to representing characters, a variable of type char can be used to hold small (1 byte) integer values.
- Most machines use the ASCII character code to represent letters and other characters as small integer numbers.
- Note that there is no correspondence between a character representing a digit and that digit's intrinsic value: the value of '2' is not 2 but 50.
- Letters do appear in alphabetical order, which makes sorting much easier.
- Some characters are the so-called escape codes, e.g. the alert character (beep) '\a' has integer value 7.
24. Floating types
- There are three floating data types:
  - float
  - double
  - long double
- Suffixes may be appended to floating constants to specify their type (e.g. 1.5F is a float and 1.5L is a long double).
- The default working floating type in C is double, not float.
- Integers are representable as floating constants, but they must be written with a decimal point.
- Exponential notation is also available: 1.2345e6 means $1.2345 \times 10^6$.
- A float is usually stored in 4 bytes, and is therefore accurate to typically 6 significant decimal digits, with a range of about $10^{-38}$ to $10^{38}$. A double is usually accurate to about 15 significant digits, and has a range of about $10^{-308}$ to $10^{308}$.
25. The sizeof Operator
- The operator sizeof yields the number of bytes needed to store an object. Such storage requirements vary from machine to machine, but it is always the case that
    sizeof(char) = 1
    sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
    sizeof(signed) = sizeof(unsigned) = sizeof(int)
    sizeof(float) <= sizeof(double) <= sizeof(long double)
- The value returned has an unsigned integer type (called size_t), as there's no need to make allowance for the possibility of negative sizes!
26. getchar and putchar
- These read characters from the keyboard and print them to the screen, one at a time. E.g.
    #include <stdio.h>
    #include <ctype.h>    /* contains the toupper macro */

    int main(void)
    {
        int c;

        while ((c = getchar()) != EOF) {
            if (c >= 'a' && c <= 'z')
                c = toupper(c);
            putchar(c);
        }
        return 0;
    }
- This program works as follows:
  - c is declared as an integer; note that a character is a small integer (one byte). Using int rather than char leaves room for the special value EOF.
  - c = getchar() reads in a character from the keyboard and assigns it to the variable c, and also to the expression as a whole.
  - The while loop continues as long as the value obtained is not EOF, the end-of-file indicator.
  - The if statement tests whether c is a lowercase character; if it is, it is changed to uppercase by storing the result of toupper(c) back in c.
  - The (uppercase) character is then printed out to the screen.
27. printf
- Example: printf("%12.2e divided by %12.2e is %12.2e.\n", x, y, x/y);
- The argument list of printf( ) has two parts: the control string, and the rest.
- The control string contains conversion specifiers such as %c, %d and so on.
- Specifiers can also include width and precision: %12.2f specifies a number of width (at least) 12, of which 2 are decimal places.
- Each specifier must correspond to an argument in the remainder of the list.
- The function itself returns the number of characters printed, although that is not often used.
31. scanf
- Example: scanf("%d%f", &x, &y);
- The function scanf( ) likewise has a control string and an arbitrary number of other arguments. The other arguments must be addresses in memory: places to write to (i.e., usually preceded by &, although the name of an array is an address in itself so doesn't need the &). If you want to input a value for the variable x and the command were scanf("%d", x), the function would be handed only the current value of x and would have no way of knowing where to store the value you are about to give it. Instead, with scanf("%d", &x), you give it the address of x directly. Forgetting the & is a common source of errors amongst novice programmers.
- A table of scanf conversion characters is also in the handout. Note that white space in the control string matches any amount of white space in the input stream; most conversions skip leading white space anyway and look for non-white-space characters. It is assumed that enough space has been allocated to hold any strings that are read in: this is the programmer's responsibility. The function returns the number of successful conversions performed.
34. Mathematical functions
- There are no built-in mathematical functions, but functions such as
    sqrt()  pow()  exp()  log()  sin()  cos()  tan()
  are available in the mathematics library. All of these take arguments of type double, and return a value of type double; pow() takes two arguments, and the others all take one.
- In order to use functions from the standard maths library, you should put #include <math.h> at the top of the program, and you have to have -lm as an argument in your compilation command:
    cc filename.c -lm
  (With many linkers the -lm must come after the source file name.)
35. Writing style
- You should write your code in a tidy way to make it nicely readable. In particular, make sure your indentation is correct: this will take care of a lot of debugging for you. Whenever you open a curly bracket (i.e. you start a compound statement), you should indent further lines underneath; indent to the same level until you either open a new bracket or close the existing one. emacs knows the rules about how to indent C properly. Hit the tab key on every line of your code, and emacs will indent it properly for you. Missing semicolons, parentheses that have been opened but not closed and so on will then stick out like a sore thumb, as they will appear to be badly indented.
- Every program you write (and arguably every subroutine) should have comment lines at the top stating the name, author, date and purpose of the code. In this respect, my example programs often fall short of best practice...
- Also be sure to include plenty of comments throughout the code to explain what's going on.
- If you don't write tidily, you'll lose marks on your homework!
36. How not to lose marks on homework
- Read the debugging techniques / good programming practice / structure of a C program notes in the handout.
- Even though the compiler doesn't care, I will deduct marks for poor indentation.
- Ditto for poorly-commented code.
- Ditto for code that doesn't have a comment at the top stating author, date, purpose.
- Ditto for using global variables.
- Ditto for dynamically-allocated memory space that you don't free before finishing the program (you'll learn about this later on).
- Ditto for opening files that you don't close (also later on).
- And of course you lose marks if your code doesn't compile, and if the compiled code doesn't run!
37. Debugging
- To make your life easier, a few simple ideas:
- Write code in small pieces: start with your main( ) function and calls to some dummy functions that you intend to fill with real code, and make sure it works; add to it a bit at a time.
- Take note of the compiler warnings and error messages! They may look cryptic at first, but they do make sense. In particular, the first message of all is important, as messages after that may be the result of compiler confusion at the first error. And it does give you the line number, which narrows things down considerably. Look for missing semicolons!
- A coredump crash usually means memory has been accessed illegally. Did you remember the & in the scanf( ) argument? Did you allocate enough memory to hold your arrays?
- You can liberally scatter printf("test 1\n") and similar statements throughout code, to see at what point the program crashes, if it does. Make sure though that you end such lines with \n, as this makes the program write it out to the screen; if you just write "test" or similar, it will store the string in a buffer (while it waits for you to finish the line) and move on, giving you no information about where the problem occurred.
38. Debugging: gdb
- A common debugger on most (all?) Unix systems is gdb. For this, you should use gcc as your compiler, with the option -g:
    gcc my_program.c -g -o my_program
    gdb my_program
    run
- You will now be in the debugger, and you have many powerful commands available to tell you what the problems are. (Type help to see the categories of commands.)
- There are debugging programs available for essentially all compilers. These are probably essential if you are doing large amounts of programming; however, they require learning a lot of new tricks, and we don't have time to discuss them here.