A C programming assignment: practice with strings and with using the compiler.
Writing and structuring programs
You should keep the following principles in mind as you develop larger and
more complex programs:
- Choose descriptive names for your variables and functions
Self-documenting code is easier to read and interpret. Code tells you how,
comments tell you why.
- Don’t repeat yourself
Refactor common code into functions so you don’t need to repeat yourself many
times.
- Avoid creating large functions
Split up large blocks of code into smaller functions. The Unix philosophy
comes into play here: you should aim to create small, concise functions that
focus on doing one thing and doing that one thing well.
- Prefer concise code
Bigger functions and programs generally take more effort for a human to
interpret.
- Prefer immutability
Use const liberally. Delegate as much work to the compiler as possible and let
it check invariants for you.
- Don’t reinvent the wheel, use the standard library
Do you know which functions come with the C standard library? Spend some
time looking through the documentation for the C standard library so you don’t
end up recreating a function that already exists.
- Use widely accepted naming and coding conventions for the language you are working in
For example, i, j and k are conventionally used as loop variables. It is
expected that functions that take non-const pointers will mutate what they
point to, so mark pointer parameters as const if your function only needs
read access.
- Be consistent
Use a single standard naming and indentation convention throughout your entire
codebase.
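As a small illustration of several of these principles together (a descriptive name, one job per function, and a const parameter for read-only access), consider the following sketch. The helper count_words is hypothetical and is not part of the exercises.

```c
#include <ctype.h>
#include <stddef.h>

// Counts the whitespace-separated words in text.
// The parameter is const-qualified because the function only reads the
// string; the compiler rejects any accidental write through it.
size_t count_words(const char *text) {
    size_t words = 0;
    int in_word = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (isspace((unsigned char)text[i])) {
            in_word = 0;
        } else if (!in_word) {
            // First character of a new word.
            in_word = 1;
            words++;
        }
    }
    return words;
}
```

Because the function does one thing, it is easy to name, easy to test and easy to reuse.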
Reading list
- Code Complete by Steve McConnell
- The Art of Unix Programming by Eric Raymond
- The Practice of Programming by Brian Kernighan and Rob Pike
- The C Programming Language (K & R) by Brian Kernighan and Dennis Ritchie
- Computer Systems - A Programmer’s Perspective by Randal Bryant and David O’Hallaron
String parsing in C
The C standard library provides the following string functions. Remember to
compile with -std=c11.
#include <stdio.h>
char * fgets(char * str, int num, FILE * stream);
int sscanf(const char * str, const char * format, ...);
#include <stdlib.h>
void free(void * ptr);
void * malloc(size_t size);
void * calloc(size_t count, size_t size);
void * realloc(void * ptr, size_t size);
int atoi(const char * s);
long atol(const char * s);
#include <string.h>
size_t strlen(const char * s);
int strcmp(const char * s1, const char * s2);
char * strcat(char * restrict s1, const char * restrict s2);
char * strcpy(char * restrict s1, const char * restrict s2);
char * strtok(char * restrict s1, const char * restrict s2);
int memcmp(const void * s1, const void * s2, size_t n);
void * memset(void * b, int c, size_t len);
void * memmove(void * dst, const void * src, size_t len);
void * memcpy(void * restrict dst, const void * restrict src, size_t n);
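A common way to combine these functions is to read a whole line with fgets and then pick it apart with sscanf. The sketch below assumes a made-up line format of "<name> <age>"; the function names and buffer sizes are illustrative only.

```c
#include <stdio.h>

#define LINE_BUF_SIZE 128  // hypothetical limit chosen for this sketch

// Parses a line of the form "<name> <age>" into the caller's buffers.
// Returns 1 on success and 0 if the line does not match.
int parse_person(const char *line, char name[32], int *age) {
    // %31s limits the name to 31 characters plus the terminating '\0'.
    return sscanf(line, "%31s %d", name, age) == 2;
}

// Reads one line from stream with fgets and parses it.
// Returns 1 on success, 0 on a malformed line and -1 on EOF or error.
int read_person(FILE *stream, char name[32], int *age) {
    char line[LINE_BUF_SIZE];
    if (fgets(line, sizeof line, stream) == NULL) {
        return -1;
    }
    return parse_person(line, name, age);
}
```

Checking the return value of sscanf matters: it reports how many conversions succeeded, which is the only way to tell a malformed line from a valid one. Note that fgets truncates lines longer than the buffer, so this approach only suits input with a known maximum length.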
The GNU extensions provide some additional functions. Remember to compile
with -std=gnu11.
ssize_t getline(char ** lineptr, size_t * linecap, FILE * stream);
ssize_t getdelim(char ** lineptr, size_t * linecap, int delim, FILE * stream);
char * strfry(char * string);
char * strdup(const char * s);
char * stpcpy(char * dest, const char * src);
char * strsep(char ** stringp, const char * delim);
int strcasecmp(const char * s1, const char * s2);
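Of these, getline is particularly useful because it grows its buffer as needed, so the caller does not have to guess a maximum line length. The following is a minimal sketch; longest_line is a hypothetical helper, and the feature-test macro at the top is an assumption that makes getline visible when not compiling with -std=gnu11.

```c
#define _POSIX_C_SOURCE 200809L  // assumption: exposes getline without gnu11
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

// Returns the length of the longest line in stream, not counting the
// trailing newline.
size_t longest_line(FILE *stream) {
    char *line = NULL;  // getline allocates and grows this buffer
    size_t cap = 0;     // current capacity of the buffer
    size_t longest = 0;
    ssize_t len;
    while ((len = getline(&line, &cap, stream)) != -1) {
        if (len > 0 && line[len - 1] == '\n') {
            len--;  // ignore the newline when measuring
        }
        if ((size_t)len > longest) {
            longest = (size_t)len;
        }
    }
    free(line);  // the caller owns the buffer and must release it
    return longest;
}
```

The same buffer is reused across calls, so a loop like this performs only as many allocations as are needed to fit the longest line.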
Exercise 1: String function implementations
Write your own implementation of the atoi, strlen, strcpy, strtok and
strcasecmp functions.
When you have implemented these functions, you can compare your code to the
implementations in glibc.
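As a possible starting point, here is one way the two simplest functions might look. This is only a sketch: the my_ prefix is an assumption made to avoid clashing with the library versions, and the remaining functions are left for you to write.

```c
#include <stddef.h>

// Returns the number of characters before the terminating '\0'.
size_t my_strlen(const char *s) {
    const char *end = s;
    while (*end != '\0') {
        end++;
    }
    return (size_t)(end - s);
}

// Copies src, including its terminating '\0', into dst and returns dst.
// The caller must ensure dst is large enough and does not overlap src.
char *my_strcpy(char *restrict dst, const char *restrict src) {
    char *out = dst;
    while ((*out++ = *src++) != '\0') {
        // The assignment in the loop condition does all the work.
    }
    return dst;
}
```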
The C compiler pipeline
Let’s explore what the compiler does behind the scenes when we create a more
complex program.
Makefile - builds the program
CC=clang
CFLAGS=-g -std=c11 -Wall -Werror
TARGET=tasks
.PHONY: all clean
all: $(TARGET)
clean:
	rm -f $(TARGET)
	rm -f *.o
list.o: list.c
	$(CC) -c $(CFLAGS) $^ -o $@
tasks.o: tasks.c
	$(CC) -c $(CFLAGS) $^ -o $@
tasks: tasks.o list.o
	$(CC) $(CFLAGS) $(LDFLAGS) $^ -o $@
tasks.c - the scaffold code for the task list application
#include <stdio.h>
#include <stdlib.h>
#include "list.h"
int main(void) {
    // …
    return 0;
}
list.c - the implementation of the circular linked list
#include "list.h"
// Initializes an empty circular linked list.
void list_init(node* head) {
    head->next = head;
    head->prev = head;
}
// …
list.h - function prototypes for a circular linked list
#ifndef LIST_H
#define LIST_H
#include <stdbool.h>
typedef struct node node;
struct node {
    void* data;
    node* next;
    node* prev;
};
// Initializes an empty circular linked list.
void list_init(node* head);
// Inserts given node before the head.
void list_push(node* head, node* n);
// Inserts given node after the head.
void list_append(node* head, node* n);
// Removes the given node from the list.
void list_delete(node* n);
// Returns whether the list is empty.
bool list_empty(node* head);
#endif
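To make the prototypes above concrete, the remaining operations could be implemented roughly as follows. This is a sketch that follows the comments in list.h literally (push inserts before the head, append inserts after it) and is not necessarily the implementation the scaffold intends. The type definition is repeated so the example stands alone.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node node;
struct node {
    void *data;
    node *next;
    node *prev;
};

// An empty circular list is a head that points to itself.
void list_init(node *head) {
    head->next = head;
    head->prev = head;
}

// Inserts n before the head, which is the tail position of the circle.
void list_push(node *head, node *n) {
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

// Inserts n immediately after the head.
void list_append(node *head, node *n) {
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

// Unlinks n from its neighbours and leaves it pointing at itself.
void list_delete(node *n) {
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n;
    n->prev = n;
}

// The list is empty when the head is its own successor.
bool list_empty(node *head) {
    return head->next == head;
}
```

Because the list is circular and the head is a sentinel, none of these functions needs a special case for an empty list or for the first and last elements.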
Running make creates the object files tasks.o and list.o and then finally
the tasks program.
$ make
clang -c -g -std=c11 -Wall -Werror tasks.c -o tasks.o
clang -c -g -std=c11 -Wall -Werror list.c -o list.o
clang -g -std=c11 -Wall -Werror tasks.o list.o -o tasks
Preprocessor
Your code is first processed through the C preprocessor. This executes all of
the preprocessor directives.
You can examine the raw output of the preprocessor by calling it directly:
$ cpp tasks.c
Or by instructing the compiler to only perform the preprocessing step.
$ clang -E tasks.c
This output is very helpful when debugging problems related to macros and
other preprocessor features.
Exercise 2: C preprocessor
- What does the #include directive do?
- What are include guards and when should they be used?
- We have seen how the #define directive can be used to create compile time constants.
The #define directive can also be used to create macros.
#define PI 3.14
#define NUM 42
#define STR "string"
#define MIN(a,b) (((a) < (b)) ? (a) : (b))
#define MAX_BUFFER 1024
Like #define constants, macros are substituted into their call site in much
the same way as a textual search and replace. Why are the extra brackets
around a, b and a < b necessary in the macro definition for MIN?
For example, what happens with MIN(a++, 1)?
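One way to explore the last question is to compare the macro with an ordinary function. The sketch below assumes a fully parenthesised MIN; the function min_int and the two wrapper functions are hypothetical names introduced only for this comparison.

```c
// Macro arguments are pasted in as text, so each use of a or b in the
// body re-evaluates the original expression.
#define MIN(a, b) (((a) < (b)) ? (a) : (b))

// Function equivalent: each argument is evaluated exactly once.
static inline int min_int(int a, int b) {
    return a < b ? a : b;
}

// Returns the final value of a after evaluating MIN(a++, 1).
int macro_side_effect(void) {
    int a = 0;
    int m = MIN(a++, 1);  // expands to (((a++) < (1)) ? (a++) : (1))
    (void)m;
    return a;  // a was incremented in the comparison and again in the result
}

// Returns the final value of a after evaluating min_int(a++, 1).
int function_side_effect(void) {
    int a = 0;
    int m = min_int(a++, 1);
    (void)m;
    return a;  // a was incremented once, when the argument was evaluated
}
```

Starting from a = 0, the macro version leaves a at 2 while the function version leaves it at 1. Parentheses protect against operator precedence surprises, but they cannot prevent an argument with side effects from being evaluated more than once.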
Code generation and assembly
The -c flag on clang asks the compiler to preprocess the C code, generate
assembly and finally assemble the result into an object file. The object files
contain machine code - assembly in binary format for the target CPU. We need
to create an object file for every translation unit in our source code (every
.c file is a translation unit).
You can ask the compiler to stop after assembly generation with the following
command:
$ clang -S -g -std=c11 -Wall -Werror list.c
This command produces list.s, the assembly generated from list.c. clang calls
the assembler behind the scenes to turn this into machine code for the object
file.
You can also extract assembly from object files with objdump. Assembly files
have two different syntaxes that are equivalent in functionality. objdump
defaults to the AT&T syntax but can also output the Intel syntax.
$ clang -c -g -std=c11 -Wall -Werror list.c
$ objdump -M intel -S list.o
Since we have compiled with -g to include debugging symbols, objdump can
annotate the assembly with the source. Remember that compiling with address
sanitizer will affect the source code annotation and output of objdump.
Linker
Now we have two compiled object files, one for each translation unit. The
linking stage merges these object files together to generate the executable.
Behind the scenes, clang calls the ld linker to perform this task.
Since we often need to use variables and functions that are declared in
another translation unit, C defines the concept of linkage. The job of the
linker is to connect these translation units together.
- A variable or function has internal linkage if its name is visible only within the current translation unit; other translation units cannot refer to it.
- A variable or function has external linkage if its name can be referred to from other translation units. File-scope variables and functions have external linkage by default.
- Any file-scope variable or function that is declared static has internal linkage. It is good practice to declare every file-scope variable or function as static unless it needs to be accessible from another translation unit.
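The difference can be seen in a single source file. In this sketch the names counter, bump and next_id are hypothetical; only next_id would be reachable from another translation unit.

```c
// counter has internal linkage: the name is private to this file.
static int counter = 0;

// bump also has internal linkage, so it cannot be called from other files
// and cannot clash with a function of the same name defined elsewhere.
static int bump(void) {
    return ++counter;
}

// next_id has external linkage, the default for file-scope functions.
// Another translation unit can declare `int next_id(void);` and call it.
int next_id(void) {
    return bump();
}
```

Keeping helpers static narrows the interface of a translation unit to the functions that are meant to be shared, which is usually what a header file should then declare.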
Exercise 3: Declarations, definitions and linkage
- Which of these are declarations and which are definitions?
- Classify the linkages in the above declarations as internal or external.
- Which definitions are accessible from another translation unit in the above C file?
- What happens if the linker can’t find a function that has external linkage?
- Header files often contain only declarations. There is nothing stopping us from putting definitions into the header as well. When would this be useful?
Exercise 4: Task list application
Create an interactive task list application from the provided scaffold.
- Your application should load tasks.txt from the current directory and present each line as a task.
- Your application should prompt for commands (help, new, delete, move, undo) which manipulate the list.
- Your application should save the updated task list and exit once it encounters EOF on stdin.
- Your application should be able to handle lines of any length.
Note: since C does not have generics, we have edited the linked list to store
a void* so that it can hold any pointer type. However, this means that you
need more than one allocation for every element stored in the list (one for
the node and one for the data it points to), which isn’t very efficient. You
can upgrade list.h to the version used in the Linux kernel, which uses some
preprocessor tricks to avoid the double allocation.
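The core of that approach is to embed the list node inside the element and use a macro to get back from the node to its enclosing struct. The following is a simplified sketch of the idea, not the kernel's actual list.h; the task struct, its fields and the function names are assumptions made for illustration.

```c
#include <stddef.h>

// Intrusive list node: it links elements but carries no data pointer.
typedef struct list_node list_node;
struct list_node {
    list_node *next;
    list_node *prev;
};

// Recovers a pointer to the enclosing struct from a pointer to its member
// by subtracting the member's offset within the struct.
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

// Hypothetical element type that embeds its own list node, so one
// allocation holds both the data and the links.
typedef struct task {
    int id;
    list_node link;
} task;

void node_init(list_node *head) {
    head->next = head;
    head->prev = head;
}

// Inserts n before the head, which is the tail position of the circle.
void node_insert_before(list_node *head, list_node *n) {
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

// Walks the list and sums the id field of every task on it.
int sum_task_ids(list_node *head) {
    int total = 0;
    for (list_node *n = head->next; n != head; n = n->next) {
        total += container_of(n, task, link)->id;
    }
    return total;
}
```

The list code never needs to know the element type, yet each element is a single object, so there is no separate allocation for the data.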