
A Brief C Language String Study

Time: 2021-05-15 19:04:25

1. How do we begin to use strings in C?

1.1 String, Yes and No

C has no built-in string type. Instead, a string is represented as an array of characters that ends with the terminating character '\0'.

This is a neat idea that shows how minimal C's design is. On the other hand, it can make the language harder to learn.

Let's look at some example code showing how an array of characters can represent a string.

char example[100] = "abc";
puts(example);
// output: abc

You don't have to use up all the space in the array (as above). However, the array must have at least enough space for the text and its terminating '\0'.

"Enough space" means the number of characters plus one.

char example[4] = "abc";
puts(example);
// output: abc
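The "characters plus one" rule can be written down as a tiny helper. This is only a sketch: the name bytes_needed() is made up for illustration and is not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the C library): the number of bytes
   an array needs in order to hold text as a string. strlen() counts the
   visible characters only, so we add one for the terminating '\0'. */
size_t bytes_needed(const char *text) {
    return strlen(text) + 1;
}
```

For "abc" this gives 4, which matches sizeof("abc"), because sizeof on a string literal also counts the terminating '\0'.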

1.2 Enough Space

If we don't prepare enough space, we run into trouble: information is lost, and the compiler may not stop us. This is a bug waiting to happen.

char example[1] = "abc";
// the initializer is too long for the array. many C compilers only
// print a warning and still build the program, so "bc" is lost.
// a special case: if you have exactly 3 slots for "abc", like
// char example[3] = "abc";
// C accepts it, but there is no room for '\0', so the array is not
// a valid string. you need at least 4 slots.

 

char example[1] = "abc";
puts(example);
// with AddressSanitizer enabled: stack-buffer-overflow
// the array holds only 'a' and no '\0', so puts() reads past its end.
// printf("%s", example) may appear to print "a", but it reads past
// the array in the same way, so it is undefined behavior too.
// if you only need the first letter on purpose,
// you should not write the line like this in the first place!
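If the text only becomes known at run time, one way to make the loss explicit instead of silent is to let snprintf() do the copy and check its return value. The sketch below assumes a helper name, store_checked(), that I made up; only snprintf() itself is a standard function.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies text into dst, which has room for
   capacity bytes. Returns 1 if the whole text (plus '\0') fit,
   and 0 if it had to be cut short. When capacity is at least 1,
   dst always ends up '\0'-terminated. */
int store_checked(char *dst, size_t capacity, const char *text) {
    if (capacity == 0) {
        return 0; /* no room even for '\0' */
    }
    /* snprintf() returns the length the full text would have needed,
       not counting '\0', even when it had to truncate. */
    int needed = snprintf(dst, capacity, "%s", text);
    return needed >= 0 && (size_t)needed < capacity;
}
```

With a 2-byte buffer and "abc", the buffer holds "a" and the function returns 0, so the caller knows something was dropped.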

1.3 Pointers are more general than arrays

In C, a pointer gives us another, more general way to refer to an array. So we can also use a pointer to represent a string.

char* example = "abc";
puts(example);
// output: abc

Moreover, we can allocate the space ourselves if we need to, just as with any other array.

char* example = malloc(4 * sizeof(char));

/* the process below can be as complex as you need */
example[0] = 'a';
example[1] = 'b';
example[2] = 'c';
example[3] = '\0';
puts(example);
// output: abc
free(example); // release the memory when you are done with it

So, roughly speaking, we can treat the type char* as a string type in many places.

We can even use #define as below. But I don't think it's useful, because it can confuse you and other people once the code gets complex.

#define string char*

string example = "this apple is sweet";
puts(example);
// output: this apple is sweet

 

#define string char*

string example[4];
example[0] = "this";
example[1] = "apple";
example[2] = "is";
example[3] = "sweet";
for(int i = 0; i < 4; i++) {
    printf("%s ", example[i]);
}
// output: this apple is sweet
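One concrete way the #define approach can confuse: a macro is plain text replacement, so declaring two variables on one line does not do what it looks like. The sketch below compares it with typedef. The names string_macro and string_type are made up for this illustration.

```c
#include <stddef.h>

#define string_macro char*      /* text replacement only */
typedef char *string_type;      /* a real type alias */

/* "string_macro a, b;" expands to "char* a, b;",
   so a is a pointer but b is a single char. */
size_t size_of_second_macro_variable(void) {
    string_macro a = NULL, b = 0;
    (void)a;
    return sizeof b; /* size of a char, which is always 1 */
}

/* With typedef, both a and b are pointers to char. */
size_t size_of_second_typedef_variable(void) {
    string_type a = NULL, b = NULL;
    (void)a;
    return sizeof b; /* size of a pointer */
}
```

If an alias is wanted at all, typedef is probably the less surprising choice.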

People often use arrays and pointers interchangeably to express the same thing, so we should be prepared to recognize each of these forms.

char* example = "abc";
puts(example);
// output: abc

char example2[] = "abc";
puts(example2);
// output: abc
    
char example3[4] = "abc";
puts(example3);
// output: abc
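The forms print the same text, but they are not identical underneath. An array initialized from a literal is a writable copy, while a pointer initialized from a literal points at the literal itself, which must not be modified. The helper below is a sketch with a made-up name, capitalize_first(), to show where the difference matters.

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases the first character in place.
   The argument must point to writable memory, for example an array
   or a malloc() block, never a string literal. */
void capitalize_first(char *text) {
    if (text[0] != '\0') {
        text[0] = (char)toupper((unsigned char)text[0]);
    }
}
```

Calling it on char word[] = "abc" turns the array into "Abc". Calling it on char* p = "abc" would try to write into the literal, which is undefined behavior. The two also differ under sizeof: sizeof word is 4, while sizeof p is the size of a pointer.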

1.4 Preallocate more space if you need

Finally, if you do not know how much space you need in advance, you can preallocate more space than you expect to use.

The extra space is not wasted, because you can call realloc() as a last step to give the unused space back to the system.

char* example = malloc(100 * sizeof(char));
int index = 0;

/* this process can be as complex as you need */
example[index] = 'a';
index++;
example[index] = 'b';
index++;
example[index] = 'c';
index++;

/* we requested much more space than we need. we can give it
   back to the system with realloc() */
example = realloc(example, index + 1);

example[index] = '\0';
index++;
puts(example);
// output: abc
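The same idea also works in the other direction: start with a guess and grow when the guess turns out too small. Below is a minimal sketch under my own assumptions. The struct text_buffer and its three functions are made-up names, and doubling the capacity is just one common growth policy. Note that the result of realloc() goes into a temporary first, so the original block is not lost if realloc() fails.

```c
#include <stdlib.h>

/* Hypothetical growable string buffer (not a standard type). */
typedef struct {
    char  *data;     /* always '\0'-terminated */
    size_t length;   /* characters stored, not counting '\0' */
    size_t capacity; /* bytes allocated */
} text_buffer;

/* Returns 1 on success, 0 if malloc() failed. */
int text_buffer_init(text_buffer *tb, size_t capacity) {
    if (capacity == 0) {
        capacity = 1; /* room for '\0' at least */
    }
    tb->data = malloc(capacity);
    if (tb->data == NULL) {
        return 0;
    }
    tb->data[0] = '\0';
    tb->length = 0;
    tb->capacity = capacity;
    return 1;
}

/* Appends one character, doubling the capacity when it is full. */
int text_buffer_push(text_buffer *tb, char c) {
    if (tb->length + 1 >= tb->capacity) {
        size_t new_capacity = tb->capacity * 2;
        char *grown = realloc(tb->data, new_capacity);
        if (grown == NULL) {
            return 0; /* tb->data is still valid */
        }
        tb->data = grown;
        tb->capacity = new_capacity;
    }
    tb->data[tb->length] = c;
    tb->length++;
    tb->data[tb->length] = '\0';
    return 1;
}

/* Gives the unused space back: keeps length + 1 bytes. */
int text_buffer_shrink(text_buffer *tb) {
    char *fitted = realloc(tb->data, tb->length + 1);
    if (fitted == NULL) {
        return 0;
    }
    tb->data = fitted;
    tb->capacity = tb->length + 1;
    return 1;
}
```

After pushing 'a', 'b' and 'c' into a buffer created with capacity 100 and then shrinking it, data holds "abc" in exactly 4 bytes. The caller releases it with free(tb.data).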

 

2. What are the useful functions for manipulating strings?

2.1 strcpy(), strncpy(), memcpy()

Although a character array can be initialized with "=", we can also fill it with strcpy().

strcpy() takes two arguments. The first is the destination and the second is the source.

char example[4];
strcpy(example, "abc");
puts(example);
// output: abc

char* example = malloc(4 * sizeof(char));
strcpy(example, "abc");
puts(example);
// output: abc

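If the destination is too small for the source string, strcpy() writes past the end of the destination. The function itself does not check sizes or report an error; this is undefined behavior.

Some toolchains (a fortified C library or a sanitizer, for example) detect the overflow at run time and abort the program. In that case the problem is at least not "silent", as it was with "=", but you cannot rely on it being caught.

char* example = malloc(3 * sizeof(char));
strcpy(example, "abc");
// possible output: *** buffer overflow detected ***: terminated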

On the other hand, if the source string is shorter than the text already in the destination, the old text appears truncated. That is the effect of the '\0' copied from the source string.

char example[30] = "this apple is sweet" ;
strcpy(example, "abc");
puts(example);
// output: abc

// however, the rest of the memory is unchanged
for(int i = 5; i < 10; i++) {
    printf("%c", example[i]);
}
// output: apple

strcpy() has a more controllable version, strncpy(). It takes one more argument than the original. The n in its name is the maximum number of characters to copy from the source string.

Because you can choose n freely, the copied part of the source may not include '\0'. In that case, the destination string is not truncated; only its first n characters are overwritten.

char example[30] = "this apple is sweet" ;
strncpy(example, "abcdefg", 3);
puts(example);
// output: abcs apple is sweet

example[3] = '\0';
puts(example);
// output: abc
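Since strncpy() does not guarantee a '\0', a common habit is to wrap it so that the terminator is always written. The helper name copy_truncated() below is made up for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies as much of src as fits into dst
   (capacity bytes in total) and always terminates the result. */
void copy_truncated(char *dst, size_t capacity, const char *src) {
    if (capacity == 0) {
        return; /* nothing can be stored */
    }
    /* copy at most capacity - 1 characters. if src is shorter,
       strncpy() fills the remainder with '\0' bytes. */
    strncpy(dst, src, capacity - 1);
    /* if src was longer, no '\0' was copied, so add it ourselves */
    dst[capacity - 1] = '\0';
}
```

With a 4-byte destination and the source "abcdefg", the destination ends up as "abc".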

strncpy() is similar to another function called memcpy(). memcpy() is more general: it copies raw bytes and does not stop at '\0'. Like strncpy(), it takes three arguments.

char example[30] = "this apple is sweet" ;
memcpy(example, "abc", 3); // the '\0' of the source string is not copied.
puts(example);
// output: abcs apple is sweet

memcpy(example, "abc", 4); // the '\0' of the source string is copied.
puts(example);
// output: abc
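The difference between the two functions shows up when the source has a '\0' in the middle. The sketch below (the function name copy_both_ways() is made up) copies the same n bytes with each function so the results can be compared.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical demo: copies n bytes of src into two destinations,
   once with strncpy() and once with memcpy(). Both destinations
   must have room for at least n bytes. */
void copy_both_ways(const char *src, size_t n,
                    char *by_strncpy, char *by_memcpy) {
    /* stops reading src at the first '\0' and pads the rest with '\0' */
    strncpy(by_strncpy, src, n);
    /* copies exactly n bytes, whatever they contain */
    memcpy(by_memcpy, src, n);
}
```

For a 6-byte source holding 'a', 'b', '\0', 'c', 'd', '\0', the strncpy() result has only '\0' bytes after "ab", while the memcpy() result still contains 'c' and 'd' after the first '\0'.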

2.2 Substring

In C++ we can use substr() from the standard library. This function is very useful, but C has no such function.

Luckily, we can easily build our own substr() function using what we know about strncpy() or memcpy().

// C++ code example!
string example = "this apple is sweet";
string ex = example.substr(0, 5); // substr(position, length)
cout << ex << endl;
// output: this

 

// C code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* the begin position is included but the end position is not */
char* substr(char* str, int begin, int end) {
    char* sub = malloc((end - begin + 1) * sizeof(char));
    memcpy(sub, str + begin, end - begin); // strncpy() also works
    sub[end - begin] = '\0';
    return sub;
}

int main(void) {
    char* example = "this apple is sweet";
    char* ex = substr(example, 0, 5);
    puts(ex);
    free(ex); // the caller owns the returned memory

    return 0;
}
// output: this
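The version above trusts its arguments. If begin or end may come from outside, a variant that checks the range first might be safer. This is a sketch with a made-up name, substr_checked(); returning NULL for a bad range is my own convention, not something the article defines.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: begin is included, end is not.
   Returns a newly allocated string that the caller must free(),
   or NULL if the range is invalid or malloc() fails. */
char *substr_checked(const char *str, size_t begin, size_t end) {
    size_t length = strlen(str);
    if (begin > end || end > length) {
        return NULL; /* range does not lie inside the string */
    }
    size_t count = end - begin;
    char *sub = malloc(count + 1); /* + 1 for '\0' */
    if (sub == NULL) {
        return NULL;
    }
    memcpy(sub, str + begin, count);
    sub[count] = '\0';
    return sub;
}
```

For example, substr_checked("this apple is sweet", 5, 10) gives "apple", while a range that runs past the end of the string gives NULL instead of reading out of bounds.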

2.3 Split

Another useful string tool is split. It breaks a string into an array of smaller strings at a given delimiter. But again, C (and C++) has no split() function by default.

There is a function called strtok(), but it is not very obvious to use. We can wrap it ourselves.

One thing to note: because C cannot tell how long an array is, we use an additional integer variable to report the length ourselves.

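/* note: strtok() writes '\0' into str, so str must be a writable array,
   not a string literal. the returned pieces point into str. */
char** split(char* str, char* delim, int* returnSize) {
    char** splited = malloc(strlen(str) * sizeof(char*)); // deliberately over-allocates
    int index = 0;
    /* ref: http://www.cplusplus.com/reference/cstring/strtok/?kw=strtok */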
    char* pch;
    pch = strtok(str, delim);
    for(; pch != NULL;) {
        splited[index] = pch;
        index++;
        pch = strtok(NULL, delim);
    }
    *returnSize = index;
    return splited;
}

int main(void) {
    char example[] ="this apple is sweet";
    
    int splitedSize;
    char** splited = split(example, " ", &splitedSize);
    
    for(int i = 0; i < splitedSize; i++) {
        puts(splited[i]);
    }
    
    return 0;
}
// output: 
// this
// apple
// is
// sweet
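Because strtok() changes the string it works on, it can be handy to count the pieces first without touching the input, for example to allocate exactly the right number of pointers. The sketch below uses the standard functions strspn() and strcspn(); the name count_tokens() is made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the pieces that strtok() would produce
   for the same str and delim, without modifying str. */
size_t count_tokens(const char *str, const char *delim) {
    size_t count = 0;
    const char *p = str;
    for (;;) {
        p += strspn(p, delim);  /* skip any run of delimiters */
        if (*p == '\0') {
            break;              /* nothing left but the end */
        }
        p += strcspn(p, delim); /* skip over one token */
        count++;
    }
    return count;
}
```

For "this apple is sweet" with the delimiter " " the result is 4. Repeated or leading delimiters do not create empty pieces, which matches how strtok() behaves.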

2.4 Concatenate

Nowadays we are familiar with string + string, but that notation merely assumes that "+" means string concatenation. In C we do it more explicitly, with the function strcat().

strcat() takes two arguments. The first is the first string, which is also the destination of the concatenated result. The second is the string to append.

One thing to take care of: the destination must have enough space for the concatenated result. Otherwise strcat() writes past the end of the destination, which is undefined behavior.

char example[100] = "this apple is";
char* example2 = " sweet"; // leading space: strcat() adds no separator
char* example3 = strcat(example, example2);
puts(example3);
// output: this apple is sweet

 

char* example = "this apple is"; // points to a string literal: read-only, no spare room
char* example2 = " sweet";
char* example3 = strcat(example, example2);
puts(example3);
// undefined behavior. with AddressSanitizer you will see: AddressSanitizer:DEADLYSIGNAL
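strcat() cannot know how large the destination is, so the check has to happen before the call. One possible shape is sketched below. The name append_checked() is made up, and the sketch assumes dst already holds a '\0'-terminated string inside its capacity bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to the string in dst only if the
   result, including '\0', fits into capacity bytes.
   Returns 1 on success and 0 (leaving dst unchanged) otherwise. */
int append_checked(char *dst, size_t capacity, const char *src) {
    size_t used = strlen(dst);
    size_t extra = strlen(src);
    if (used + extra + 1 > capacity) {
        return 0; /* not enough room for the text plus '\0' */
    }
    memcpy(dst + used, src, extra + 1); /* + 1 copies the '\0' too */
    return 1;
}
```

A 20-byte array holding "this apple is" has exactly enough room for " sweet" (19 characters plus '\0'). One more character would be refused instead of overflowing.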

2.5 Match / Detect

If we want to do some text cleaning, a match or detect function is very useful. In C we have a function called strstr().

It returns a pointer to the first occurrence of the second string inside the first, or NULL if there is none. Because a NULL pointer counts as false, it can be used directly as a string-match test in an if-else statement.

char example[] = "this apple is sweet";
if(strstr(example, "apple")) {
    puts("Found");
} else {
    puts("Not Found");
}
// output: Found
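The pointer that strstr() returns is useful for more than a yes/no answer. The two helpers below are sketches with made-up names: one turns the pointer into a position, the other counts matches by searching again from just after each hit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first match, or -1 if none. */
long find_offset(const char *haystack, const char *needle) {
    const char *hit = strstr(haystack, needle);
    return hit != NULL ? (long)(hit - haystack) : -1;
}

/* Hypothetical helper: number of non-overlapping matches. */
size_t count_occurrences(const char *haystack, const char *needle) {
    size_t step = strlen(needle);
    if (step == 0) {
        return 0; /* an empty needle would match everywhere */
    }
    size_t count = 0;
    const char *hit = strstr(haystack, needle);
    while (hit != NULL) {
        count++;
        hit = strstr(hit + step, needle); /* continue after this match */
    }
    return count;
}
```

In "this apple is sweet" the word "apple" is found at offset 5, and a word that does not occur gives -1.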

 


Source: https://www.cnblogs.com/drvongoosewing/p/14771296.html
