1. How can we begin to use strings in C?
1.1 String, Yes and No
C has no built-in string type. Instead, a string is represented by an array of characters that ends with the terminator '\0'.
This is an elegant idea and shows how minimal C's design is. On the other hand, it can make strings harder to learn.
Let's look at some example code showing how an array of characters can represent a string.
    char example[100] = "abc";
    puts(example);
    // output: abc
You don't have to use all the space in the array (as above). However, the array must have at least enough space for the text and its terminator '\0'.
Enough space means the number of characters plus one.
    char example[4] = "abc";
    puts(example);
    // output: abc
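To make the "number of characters plus one" rule concrete, here is a minimal sketch. The helper names count_chars() and bytes_needed() are hypothetical (count_chars() does the same job as the standard strlen()); they are written out only to show that the terminator is what marks the end of a string.

```c
#include <stddef.h>

/* Hypothetical helper: walks the array until it meets '\0',
   which is also how the standard strlen() finds the end. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Storage needed for a string: its characters plus one terminator. */
size_t bytes_needed(const char *s)
{
    return count_chars(s) + 1;
}
```

With these, count_chars("abc") gives 3 and bytes_needed("abc") gives 4, which matches the size used in char example[4].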
1.2 Enough Space
If we don't reserve enough space, we run into trouble: information is lost, sometimes silently. This is a bug waiting to happen.
    char example[1] = "abc";
    // Compilers typically only warn here (GCC, for example, reports that the
    // initializer string is too long) and the extra characters are dropped.
    // The special case is an array with exactly 3 elements for "abc":
    //     char example[3] = "abc";
    // This is legal C and is usually accepted without any message,
    // but the '\0' is lost. You need at least 4 elements.
    char example[1] = "abc";
    puts(example);
    // Undefined behavior: the array holds no '\0', so puts() reads past its end.
    // With AddressSanitizer enabled this is reported as stack-buffer-overflow.
    // printf("%s", example) has the same problem: it may appear to print "a",
    // but nothing guarantees that. If you only need the first letter on
    // purpose, you should not write the line like this in the first place.
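One way to notice lost characters instead of losing them silently is to let snprintf() do the copy and compare its return value with the buffer size. This is only a sketch; the name store_text() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: copies text into buf (capacity bytes) and
   returns 1 if everything fit, 0 if the text had to be cut short.
   When capacity > 0, buf is always left terminated. */
int store_text(char *buf, size_t capacity, const char *text)
{
    /* snprintf() returns the length the full text would need,
       not counting the terminator. */
    int written = snprintf(buf, capacity, "%s", text);
    return written >= 0 && (size_t)written < capacity;
}
```

With a 4-byte buffer, store_text() reports success for "abc". With a 3-byte buffer it reports failure and leaves the terminated string "ab" in the buffer.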
1.3 A pointer is more general than an array
In C, a pointer offers another, more general way to refer to an array, so we can also use a pointer to represent a string. Note that a string literal such as "abc" must not be modified through the pointer.
    char* example = "abc";
    puts(example);
    // output: abc
Moreover, we can allocate the space ourselves if we need to, just as with any other array.
    char* example = malloc(4 * sizeof(char)); // needs <stdlib.h>
    /* the process below can be as complex as you need */
    example[0] = 'a';
    example[1] = 'b';
    example[2] = 'c';
    example[3] = '\0';
    puts(example);
    // output: abc
    free(example); // release the memory when you are done with it
So, roughly speaking, we can treat the type char* as a string type in many places.
We can even use #define as below. However, I don't think it is useful, because it can confuse you and other people once the code gets complex. For instance, "string a, b;" expands to "char* a, b;", which makes b a plain char.
    #define string char*
    string example = "this apple is sweet";
    puts(example);
    // output: this apple is sweet

    #define string char*
    string example[4];
    example[0] = "this";
    example[1] = "apple";
    example[2] = "is";
    example[3] = "sweet";
    for (int i = 0; i < 4; i++) {
        printf("%s ", example[i]);
    }
    // output: this apple is sweet
People often use arrays and pointers interchangeably to express the same thing, so it helps to recognize each of these forms.
    char* example = "abc";
    puts(example);
    // output: abc

    char example2[] = "abc";
    puts(example2);
    // output: abc

    char example3[4] = "abc";
    puts(example3);
    // output: abc
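The three forms print the same text, but they are not the same kind of object. The sketch below (function names are hypothetical) shows two differences: an array holds its own modifiable copy of the characters, while a pointer only refers to the literal, and sizeof reports different things for each.

```c
#include <stddef.h>

/* An array initialized from "abc" is a private, modifiable copy. */
size_t array_storage(void)
{
    char as_array[] = "abc";   /* array of 4 chars */
    as_array[0] = 'x';         /* fine: the array belongs to us */
    return sizeof as_array;    /* three letters plus '\0' */
}

/* A pointer only refers to the literal, which must not be modified. */
size_t pointer_storage(void)
{
    const char *as_pointer = "abc";
    return sizeof as_pointer;  /* size of the pointer, not of the text */
}
```

Here array_storage() returns 4, while pointer_storage() returns the size of a pointer on the platform, regardless of how long the text is.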
1.4 Preallocate more space if you need it
Finally, if you do not know in advance how much space you need, you can preallocate more than you expect to use.
The extra space need not be wasted, because you can call realloc() as the last step to return the unused part to the system.
    char* example = malloc(100 * sizeof(char));
    int index = 0;
    /* this process can be as complex as you need */
    example[index] = 'a'; index++;
    example[index] = 'b'; index++;
    example[index] = 'c'; index++;
    /* we requested a lot more space than needed;
       realloc() gives the unused part back to the system */
    example = realloc(example, index + 1);
    example[index] = '\0'; index++;
    puts(example);
    // output: abc
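The same realloc() call can also work in the other direction: if even a generous first guess might be too small, the buffer can grow on demand. The following is a rough sketch under that assumption; the Buffer type and buffer_push() are hypothetical names, not part of the standard library.

```c
#include <stdlib.h>

/* Hypothetical growable string: data holds len characters plus '\0'. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} Buffer;

/* Appends one character, doubling the capacity when it runs out.
   Returns 0 on success, -1 if memory could not be obtained. */
int buffer_push(Buffer *b, char c)
{
    if (b->len + 2 > b->cap) {            /* room for c and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 8;
        char *tmp = realloc(b->data, new_cap);
        if (tmp == NULL) {
            return -1;                    /* b->data is still valid */
        }
        b->data = tmp;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';               /* keep it a valid string */
    return 0;
}
```

Start from a zeroed Buffer (realloc() on a NULL pointer behaves like malloc()), push characters one by one, and call free() on data at the end. Assigning the result of realloc() to a temporary first avoids losing the original block if the call fails.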
2. What are the useful functions for manipulating strings?
2.1 strcpy(), strncpy(), memcpy()
Although a character array can be initialized with "=", we can also fill it with strcpy().
strcpy() takes two arguments: the first is the destination and the second is the source.
    char example[4];
    strcpy(example, "abc");
    puts(example);
    // output: abc

    char* example2 = malloc(4 * sizeof(char));
    strcpy(example2, "abc");
    puts(example2);
    // output: abc
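If the destination is too small for the source string, strcpy() does not check this and does not return an error: it simply writes past the end of the destination, which is undefined behavior.
On some toolchains a runtime protection (for example a fortified C library build or a sanitizer) detects the overflow and aborts the program, as below. In that case the mistake is not as "silent" as with "=", but you cannot rely on it. Making sure the destination is large enough remains your job.
    char* example = malloc(3 * sizeof(char));
    strcpy(example, "abc"); // needs 4 bytes, only 3 are available
    // possible output: *** buffer overflow detected ***: terminated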
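One way to avoid depending on runtime protections is to compare sizes before copying. The wrapper below is a minimal sketch; copy_checked() is a hypothetical name, and the caller is assumed to pass the true size of the destination.

```c
#include <string.h>

/* Hypothetical wrapper: copies src into dst only if it fits,
   terminator included. Returns 0 on success, -1 otherwise. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* characters plus '\0' */
    if (needed > dst_size) {
        return -1;                     /* dst is left untouched */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

For a char example[4], copy_checked(example, sizeof example, "abc") succeeds, while the same call with a 3-byte destination returns -1 instead of overflowing.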
On the other hand, if the source string is shorter than the text already in the destination, the destination appears truncated. That is the effect of the '\0' copied from the source string.
    char example[30] = "this apple is sweet";
    strcpy(example, "abc");
    puts(example);
    // output: abc
    // however, the rest of the memory is unchanged
    for (int i = 5; i < 10; i++) {
        printf("%c", example[i]);
    }
    // output: apple
strcpy() has a more controllable version, strncpy(). It takes one more argument, n, which is the maximum number of characters to copy from the source string.
Because you can choose n freely, the copied part may not include the '\0'. In that case the destination is not truncated; only its first n characters are overwritten.
    char example[30] = "this apple is sweet";
    strncpy(example, "abcdefg", 3);
    puts(example);
    // output: abcs apple is sweet
    example[3] = '\0';
    puts(example);
    // output: abc
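Since strncpy() may leave the destination without a terminator, a common pattern is to copy at most size - 1 characters and then write the '\0' yourself. A minimal sketch follows, with copy_truncated() as a hypothetical name and the assumption that dst_size is at least 1.

```c
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 characters and
   always terminates dst. Assumes dst_size >= 1. */
void copy_truncated(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy() may not have written one */
}
```

With a 4-byte destination, copying "abcdefg" this way leaves "abc", a valid but shortened string. Whether silent truncation is acceptable depends on the program.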
strncpy() is similar to another function, memcpy(). memcpy() is more general (it copies raw bytes and does not stop at '\0') and also takes three arguments.
    char example[30] = "this apple is sweet";
    memcpy(example, "abc", 3); // the '\0' of the source string is not copied
    puts(example);
    // output: abcs apple is sweet
    memcpy(example, "abc", 4); // the '\0' of the source string is copied
    puts(example);
    // output: abc
2.2 Substring
In C++ we can use substr() out of the box. This function is very useful, but C has no equivalent.
Luckily, we can easily build our own substr() function with what we know about strncpy() or memcpy().
    // C++ code example!
    string example = "this apple is sweet";
    string ex = example.substr(0, 5);
    cout << ex << endl;
    // output: this  (five characters, including the trailing space)
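    // C code
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* the begin position is included, the end position is not */
    char* substr(const char* str, int begin, int end) {
        char* sub = malloc((end - begin + 1) * sizeof(char));
        if (sub == NULL) return NULL;
        memcpy(sub, str + begin, end - begin); // strncpy() would also work
        sub[end - begin] = '\0';
        return sub;
    }

    int main(void) {
        const char* example = "this apple is sweet";
        char* ex = substr(example, 0, 5);
        if (ex != NULL) puts(ex);
        free(ex); // the caller owns the returned string
        return 0;
    }
    // output: this  (five characters, including the trailing space)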
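The substr() above trusts its arguments. If the positions may come from untrusted input, a variant that checks them against the string length could look like the sketch below. The name substr_checked() and the choice to clamp end to the string length are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: begin is included, end is not. An end past
   the string is clamped to its length; begin > end yields NULL.
   The caller must free() the result. */
char *substr_checked(const char *str, size_t begin, size_t end)
{
    size_t len = strlen(str);
    if (end > len) {
        end = len;
    }
    if (begin > end) {
        return NULL;
    }
    size_t n = end - begin;
    char *sub = malloc(n + 1);
    if (sub == NULL) {
        return NULL;
    }
    memcpy(sub, str + begin, n);
    sub[n] = '\0';
    return sub;
}
```

For "this apple is sweet", substr_checked(example, 5, 10) yields "apple", and an oversized end such as substr_checked("abc", 1, 99) yields "bc" instead of reading past the terminator.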
2.3 Split
Another useful string tool is split. It breaks a string into an array of smaller strings at a given delimiter. But again, neither C nor C++ provides a split() function by default.
There is a function called strtok(), but it is not very intuitive to use, so we can wrap it ourselves. Keep in mind that strtok() modifies the string it splits, so the input must be a writable array, not a string literal.
One thing to note: because C cannot tell how long an array is, we use an additional integer parameter to report the size ourselves.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char** split(char* str, const char* delim, int* returnSize) {
        // over-allocate: there can never be more pieces than characters
        char** splited = malloc((strlen(str) + 1) * sizeof(char*));
        if (splited == NULL) { *returnSize = 0; return NULL; }
        int index = 0;
        /* ref: http://www.cplusplus.com/reference/cstring/strtok/?kw=strtok */
        char* pch = strtok(str, delim);
        while (pch != NULL) {
            splited[index] = pch; // points into str, no copy is made
            index++;
            pch = strtok(NULL, delim);
        }
        *returnSize = index;
        return splited;
    }

    int main(void) {
        char example[] = "this apple is sweet";
        int splitedSize;
        char** splited = split(example, " ", &splitedSize);
        for (int i = 0; i < splitedSize; i++) {
            puts(splited[i]);
        }
        free(splited);
        return 0;
    }
    // output:
    // this
    // apple
    // is
    // sweet
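If over-allocating the pointer array feels wasteful, the pieces can be counted first without touching the input. The sketch below uses the standard strspn() and strcspn() functions; count_tokens() is a hypothetical name. Unlike strtok(), it does not modify the string, so it also works on string literals.

```c
#include <string.h>

/* Hypothetical helper: counts the pieces that splitting str at any
   character of delim would produce. str is only read. */
size_t count_tokens(const char *str, const char *delim)
{
    size_t count = 0;
    const char *p = str;
    for (;;) {
        p += strspn(p, delim);    /* skip leading delimiters */
        if (*p == '\0') {
            break;
        }
        count++;
        p += strcspn(p, delim);   /* skip over the piece itself */
    }
    return count;
}
```

For "this apple is sweet" and the delimiter " ", this returns 4, which is exactly the number of char* slots that split() ends up using.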
2.4 Concatenate
Nowadays we are used to writing string + string, but that relies on the presumption that + means concatenation. In C we do it more explicitly with the function strcat().
strcat() takes two arguments: the first is the first string, which is also the destination of the concatenated result; the second is the string to append.
One thing to take care of: the destination must have enough space for the concatenated result. strcat() does not check this, and overflowing the destination is undefined behavior.
    char example[100] = "this apple is";
    char* example2 = " sweet"; // note the leading space; strcat() adds none
    char* example3 = strcat(example, example2); // returns example itself
    puts(example3);
    // output: this apple is sweet

    char* example = "this apple is"; // a read-only literal with no spare room
    char* example2 = " sweet";
    char* example3 = strcat(example, example2); // undefined behavior
    puts(example3);
    // with AddressSanitizer you may see: AddressSanitizer:DEADLYSIGNAL
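When neither input has room to spare, an alternative is to allocate a fresh string of exactly the right size. This is a sketch rather than a standard function; concat_new() is a hypothetical name, and the caller is responsible for calling free() on the result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding
   a followed by b, or NULL if memory could not be obtained. */
char *concat_new(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *out = malloc(la + lb + 1);   /* both texts plus one '\0' */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);       /* also copies the '\0' of b */
    return out;
}
```

Because both inputs are only read, string literals are safe here: concat_new("this apple is", " sweet") produces "this apple is sweet".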
2.5 Match / Detect
If we want to do some text cleaning, a match or detect function is very useful. In C we have the function strstr().
It returns a pointer to the first occurrence of the substring, or NULL if there is none, so it can be used directly as a match test in an if-else statement.
    char example[] = "this apple is sweet";
    if (strstr(example, "apple")) {
        puts("Found");
    } else {
        puts("Not Found");
    }
    // output: Found
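Because strstr() returns a pointer into the searched string, subtracting the start of that string gives the position of the match. A small sketch, with find_index() as a hypothetical name:

```c
#include <string.h>

/* Hypothetical helper: returns the index of the first occurrence
   of word in text, or -1 if word does not occur. */
long find_index(const char *text, const char *word)
{
    const char *hit = strstr(text, word);
    return hit != NULL ? (long)(hit - text) : -1;
}
```

For "this apple is sweet", find_index(example, "apple") gives 5, the same offset at which the earlier strcpy() example found the leftover "apple".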
A Brief C Language String Study
Original: https://www.cnblogs.com/drvongoosewing/p/14771296.html