C and assembler on Linux

From Noisebridge
This is a page for the C and assembler on linux class, Tuesdays at 5:30 PM in Church.
 
 
Here's a write-up that covers the first half of the first C on Linux
class, which I gave last Tuesday (2012-06-19) in the Church classroom
from 5:30 to 7 PM.
 
  I hope to write up the balance of last Tuesday's
 
class before the weekend's out.
 
  Note the To: list, please; if you know of anyone
 
who's missing, please let them and me know.
 
  Complaints, suggestions, sarcasms, all are welcome.
 
 
jim
 
415 823 4590 my cellphone, call anytime
 
 
 
Learning C programming on Linux
 
 
* The C programming language is a specification that defines keywords,
operators, and rules of syntax.
 
  This may sound stupidly obvious or useless knowledge, but you may,
 
if you really get into using C, find that it's a practical
 
concept--useful, intelligently obvious.
 
 
* A C compiler is a software program that implements the C specification:
it parses your text and applies the keywords, operators, and syntax rules.
 
  The practical purpose of this idea is that there are different C
 
compilers for different machines and for different purposes. If you're
 
just starting to learn C, this idea will seem pretty nearly as useless
 
as the idea that C is a specification.
 
 
 
  The tools you use to write C programs include an editor and a C
 
compiler at minimum. There are a lot more tools available, such as
 
debuggers and profilers and more.
 
 
  The process you follow is to use a text editor to write some ASCII
text that complies with the rules of the C language, then use a C
compiler to read your ASCII file and create a new file that contains
executable machine code.
 
  Look for C compiler-generated error messages. If there are any, even
one, then the compiler does not make an executable file; you have to fix
all errors. You may also see warning messages, which indicate that the
compiler found one or more things that are questionable but was able to
continue. Warnings alone, however many there are, do not stop the
compiler from making the executable file (unless you ask it to treat
warnings as errors, for example with gcc's -Werror option), but they are
worth reading.
 
  If you get an executable file, run it and see if it works as you
 
expect. If it does, you probably won't learn anything more from this
 
exercise. If it doesn't, you get to learn about runtime and logic
 
errors: you wrote a program that is correct according to the C language
 
but incorrect in terms of implementing what you hoped it would do.
 
 
  The following commands exemplify the process using a bash shell:
 
$ vi myfile.c
 
$ gcc myfile.c
 
$ ls
 
a.out  myfile.c
 
$ chmod 755 a.out
 
$ ./a.out
 
 
 
  You use a text editor such as vi to create a file of text that
 
conforms to the rules of the C specification.
 
 
  You run the C compiler so that it reads what you wrote. The C
 
compiler sees your program file as an ASCII character stream that it
 
interprets as a token stream.
 
  So, what is a "token"? A token is one or more ASCII characters that
 
the compiler sees as a meaningful thing. To compare with the English
 
language, think of a token as a word or a word ending or punctuation or
 
some other element that's meaningful.
 
 
  The C compiler follows a design that is common to interpreters and
compilers. Generally, any compiler or interpreter includes an input
stage that splits the incoming ASCII stream into tokens, a set of
keywords and operators (reserved sequences of ASCII characters), and a
set of rules that the compiler applies to the tokens it reads.
 
  When the compiler begins, it sets itself to a neutral state, which
is to say that it will examine the first ASCII characters to verify that
it can parse them as a stream of tokens.
 
  When the compiler identifies the first token, it verifies that that
 
token is of a class that can be a first token and then resets its (the
 
compiler's) state so that the following token must be one of a limited
 
set of tokens. For example:
 
1+2
 
  The compiler reads the 1 and then the '+' character, at which point
it determines that it has one complete token: 1. (A '+' cannot be part
of a number, so the number must have ended.) The compiler continues
reading, sees the 2, and determines that it now has two tokens, 1 and
'+'. The 1 token is an integer constant whose value is 1. The '+' token,
because it occurs between the 1 and the 2, represents the addition
operator. The compiler continues reading, finds only whitespace, and is
then able to identify the ASCII stream as a set of three tokens--a
value, an operator, and a value--that together form an expression.
 
  An expression is at least one operand and zero or more operators
 
that must be resolved to a single value.
 
  The compiler resolves the expression 1+2 to be a single value of 3.
 
  If you write a C program that is exactly 1+2 and nothing else, your
compiler will generate an error message. The exact wording depends on
the compiler (remember, a compiler implements the C programming language
specification, and does so in its own way--the C specification is
deliberately permissive in some aspects of implementation).

  Very likely the message will be a complaint that the compiler
expected something other than a number, or that there's not a complete
statement, or that there's a problem at the end of the file, or some
such.
 
 
  The C compiler is designed to read statements. A statement is a set
 
of valid tokens that follow the rules of the C programming language and
 
end with a statement termination character, which is the ; character.
 
  Try revising your program to read
 
1+2;
 
  The 1+2 is an expression: the C compiler sees 1 followed by +
 
followed by 2 and verifies that this is a valid sequence of tokens that
 
makes an expression. It interprets the ; character as a statement
 
terminator, which means the compiler creates the machine code for the
 
expression and resets itself to a neutral state, ready to read the next
 
statement (ASCII character stream of valid tokens).
 
  On its own, though, this file still does not compile: standard C
allows a statement only inside the body of a function, so gcc reports an
error and makes no a.out. Once the statement sits inside a function such
as main, it compiles, possibly with a warning that the statement has no
effect.

  You may think that the compiler would leave the 1+2 in the file as
data and machine instructions that the CPU runs to create the sum, 3.
Instead, the compiler does the addition as it does the compiling and
works with the single value 3. That the compiler does the arithmetic
ahead of time is a matter of optimization.
 
 
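  It is convenient to think of the C compiler as running in four
different phases (a real toolchain such as gcc also runs an assembler
before the linker, and does its optimizing as part of the compiler
phase):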
 
1 preprocessor
 
2 compiler
 
3 optimizer
 
4 linker
 
 
  Consider the program:
 
1+2;
 
  The preprocessor runs and sees nothing to do.
 
  The compiler runs and translates the ASCII to data and machine code,
 
which properly is a set of 1 bits and 0 bits that represent integer 1,
 
integer 2, and the operation of addition.
 
  The optimizer recognizes that this expression can be resolved now
 
without doing any harm to any other parts of the program, so the
 
optimizer replaces the code with the integer value of 3.
 
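  The linker runs and has little to do: this module refers to no other
module of yours. (For a complete program, the linker still joins your
code to the C runtime startup code that calls main.)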
 
 
  Consider the following program:
 
1+2
 
3 + 4 ;
 
  How many statements do you see? How many expressions? How many
 
tokens?
 
 
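  There is a single ; character, so at most a single statement, and a
total of seven tokens: 1, +, 2, 3, +, 4, and ; (we're not counting the
space characters or the newline characters). The newline after 1+2 does
not end anything; only the ; does.

  Note that the C compiler sees 1+2 and 3 + 4 identically: two
expressions that add two integer values together. The spaces make no
difference. But because neither an operator nor a ; separates the 2 from
the 3, the seven tokens do not form a valid statement, and the compiler
reports a syntax error. Put a ; after 1+2 and you have two statements.
Very likely the resulting program will effectively be 3 7 after the
optimizer pass does its thing.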
 
  Note that the program does nothing with the 3 and the 7.

  Because of that, the optimizer of your compiler may detect that there
are no machine operations for the CPU to perform and eliminate the
values altogether. gcc does so, and with the -Wall option it warns about
a statement with no effect. If you want to make a file that contains
only data and link it to one or more other programs that you'll write at
some time, declare the data as variables, which the compiler keeps.
 
 
  The discussion so far includes the terms ASCII stream, token stream,
 
values, operands, operators, expressions, statements, and the four
 
compiler passes: preprocessor, compiler, optimizer, and linker.
 

Revision as of 14:26, 22 June 2012
