L-Systems in C, Part 2: Simple strings

This is the second post in a series covering my time spent learning C by experimenting with procedurally generating plants using L-Systems. In this post, I put together a binary L-Tree generator using vanilla C strings.

The basic algorithm behind generating strings with context-free L-System grammars is as follows:

Write your axiom to a string l_str
Prepare a string to write to tmp
For each char in l_str look up its production rule r
If found, append r to tmp
If no matching r exists, append char to tmp
Copy tmp to l_str
Clear tmp
Repeat from #3 for the desired number of iterations

To keep things simple, in my first iteration I hardcoded a simple binary tree lookup table for step 3 so that I could focus on figuring out exactly how to allocate enough memory for all those append operations.

First Pass: Stack-Allocated strings

The ultra-simple solution here is to statically define the size of l_str and tmp to something large enough to handle the number of iterations we're interested in:

 1int main(void) {
 2  // Statically allocate 1000 bytes for l_str and tmp
 3  char l_str[1000] = "F"; 
 4  char tmp[1000] = "\0";
 5  for (size_t i = 0; i < 5; i++) {
 6    for (size_t j = 0; j < strlen(l_str); j++) {
 7      switch (l_str[j]) {
 8        case 'F': {
 9          sprintf(tmp, "%sG[+F]-F", tmp);
10          break;
11        }
12        case 'G': {
13          sprintf(tmp, "%sGG", tmp);
14          break;
15        }
16        default: {
17          sprintf(tmp, "%s%c", tmp, l_str[j]);
18          break;
19        }
20      }
21    }
22    // Replace the previous iteration with the current one
23    strcpy(l_str, tmp);
24    // Clear tmp
25    memset(tmp, 0, sizeof(tmp));
26    printf("%lu: %s\n", i, l_str);
27  } 
28  return 0;
29}

This gives the expected result but only works for the first six iterations, after which point our string needs to grow past 1000 characters, and we hit undefined behavior as we write outside our (stack) allocated memory. We could bump up the size of l_str and tmp to fit whatever number of iterations we want (and we have enough memory to handle) but I think it's far more interesting at this point to explore how to dynamically grow our strings. On to the next pass...

Second Pass: `malloc`, `realloc` and `free`

Instead of statically allocating all the memory we need up front, we can dynamically allocate some memory and expand that allocation as needed. We'll have to malloc our initial strings, realloc them as we need them to grow, and remember to free them once we're done.

 1int main(void) {
 2  // Allocate enoguh memory for l_str to hold our axiom
 3  char *l_str = malloc(sizeof(char) * 2);
 4  
 5  // Copy our axiom into l_str
 6  strcpy(l_str, "F");
 7  
 8  // Allocate enough for tmp to hold a null terminator
 9  char *tmp = malloc(sizeof(char) * 1);
10  
11  // Write the null terminator, making this an empty string
12  tmp[0] = '\0';
13  for (size_t i = 0; i < 6; i++) {
14    for (size_t j = 0; j < strlen(l_str); j++) {
15    
16      // Record the current size of tmp 
17      // (including its null terminator)
18      size_t old_len = strlen(tmp) + 1;
19      
20      size_t new_len;
21      switch (l_str[j]) {
22        case 'F': {
23          // Calculate the new required length
24          new_len = old_len + 7;
25          
26          // Reallocate tmp to the new length
27          tmp = realloc(tmp, new_len);
28          
29          // Append our rule to tmp
30          snprintf(tmp, new_len, "%sG[+F]-F", tmp);
31          
32          break;
33        }
34        case 'G': {
35          new_len = old_len + 2;
36          tmp = realloc(tmp, new_len);
37          snprintf(tmp, new_len, "%sGG", tmp);
38          break;
39        }
40        default: {
41          new_len = old_len + 1;
42          tmp = realloc(tmp, new_len);
43          snprintf(tmp, new_len, "%s%c", tmp, l_str[j]);
44          break;
45        }
46      }
47    }
48    // No need to strcpy and memset: we can swap pointers
49    const auto swap = l_str;
50    l_str = tmp;
51    tmp = swap;
52    
53    // Set tmp back to an empty string the cheap way
54    tmp[0] = '\0';
55    printf("%lu: (%lu) %s\n", i, strlen(l_str), l_str);
56  }
57  
58  // Free up our memory once we're done
59  free(tmp);
60  free(l_str);
61  
62  return 0;
63}

A few interesting things to point out:

We need to remember to add 1 to each strlen call to account for null terminators.
We're using snprintf (as opposed to sprintf) to ensure we only write within available memory.
We can "clear" tmp by setting the first byte to a null terminator and ignoring the rest of its allocated memory, effectively turning it into a zero-length string.
Things get really slow as the iteration count increases.

Running a profiler over this shows that the majority of our time is spent on strlen - this makes sense! We're explicitly executing strlen twice per iteration of our inner loop as part of the condition expression and getting the length of tmp, and once more implicitly when we use snprintf to write tmp to itself. strlen works by counting (sequentially) the number of characters before the first null terminator it finds - that's a lot of iteration! We've got some painfully exponential time complexity at play here.

Third Pass: No More `strlen`

The easy fixes here are to explicitly keep track of the length of l_str and tmp, and to use strcpy and some pointer arithmetic to append rules to tmp. Making these changes has a big impact: slowdown only starts becoming noticeable at around 20 iterations, with a string length of ~32,000,000!

 1int main(void) {
 2  char *l_str = malloc(sizeof(char) * 2);
 3  strcpy(l_str, "F");
 4  char *tmp = malloc(sizeof(char) * 1);
 5  tmp[0] = '\0';
 6  size_t tmp_len = 0;
 7  size_t lstr_len = 1;
 8  for (size_t i = 0; i < 20; i++) {
 9    for (size_t j = 0; j < lstr_len; j++) {
10      size_t new_len;
11      switch (l_str[j]) {
12        case 'F': {
13          new_len = tmp_len + 7;
14          
15          // Don't forget space for the null terminator!
16          tmp = realloc(tmp, new_len + 1);
17          
18          // Copy the rule to the address of tmp offset by
19          // the length of tmp, i.e. the end.
20          strcpy(tmp + tmp_len, "G[+F]-F");
21          break;
22        }
23        case 'G': {
24          new_len = tmp_len + 2;
25          tmp = realloc(tmp, new_len + 1);
26          strcpy(tmp + tmp_len, "GG");
27          break;
28        }
29        default: {
30          new_len = tmp_len + 1;
31          tmp = realloc(tmp, new_len + 1);
32          tmp[tmp_len] = l_str[j];
33          tmp[tmp_len + 1] = '\0';
34          break;
35        }
36      }
37      tmp_len = new_len;
38    }
39    char *swap = l_str;
40    l_str = tmp;
41    tmp = swap;
42    tmp[0] = '\0';
43    
44    // Swap the lengths too
45    lstr_len = tmp_len;
46    tmp_len = 0;
47    printf("%lu: (%lu) \n", i, lstr_len);
48  }
49  free(tmp);
50  free(l_str);
51  return 0;
52 }

Thoughts

It's worth calling out that the JavaScript implementation of the above that I described in the L-Systems in C vs JavaScript post has comparable performance to the third pass above. While I won't claim my clumsy foray into C is close to being as optimised as possible, it's a testament to the engineering behind modern JavaScript engines that we can get the same performance (with arguably much higher readability) out of under 10 lines of JS code that we get with four times as much C code.

In the next post, we'll tackle some of that readability by adding some abstractions around our string handling.

First Pass: Stack-Allocated strings

Second Pass: malloc, realloc and free

Third Pass: No More strlen

Thoughts

Second Pass: `malloc`, `realloc` and `free`

Third Pass: No More `strlen`