Answer: What does this code do?


I previously asked what the code below does, and mentioned that it should give interesting insight into the kind of mindset and knowledge a candidate has. Take a look at the code again:


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/stat.h>


#define BUFFER_SIZE (3ULL * 1024 * 1024 * 1024) // 3GB in bytes


int main() {
    int fd;
    char *buffer;
    struct stat st;


    buffer = (char *)malloc(BUFFER_SIZE);
    if (buffer == NULL) {
        return 1;
    }


    fd = open("large_file.bin", O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        return 2;
    }


    if (write(fd, buffer, BUFFER_SIZE) == -1) {
        return 3;
    }


    if (fsync(fd) == -1) {
        return 4;
    }


    if (close(fd) == -1) {
        return 5;
    }


    if (stat("large_file.bin", &st) == -1) {
        return 6;
    }


    printf("File size: %.2f GB\n", (double)st.st_size / (1024 * 1024 * 1024));


    free(buffer);
    return 0;
}

This program will output: File size: 2.00 GB

And it will write 2 GB of zeros to the file:


~$ head  large_file.bin  | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
7ffff000

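The question is why? And the answer is quite simple. Linux caps a single write() call at about 2 GB. The exact limit is 0x7ffff000 bytes (2,147,479,552), which is the 7ffff000 offset at the end of the hexdump above. That is 4 KB short of 2 GB, so it prints as 2.00 GB. Any write call that attempts to write more than that will only write that much, and you’ll have to call write() again for the rest. This is not an error, mind. The write call is free to write less than the size of the buffer you passed to it, and it reports how much it actually wrote in its return value, which the code above only ever compares to -1.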
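The usual way to cope with short writes is to loop until the whole buffer has been accepted. Here is a minimal sketch of that idea. The helper name write_all is my own and does not appear in the original program, and retrying on EINTR is an assumption about what you would want here, not something the original code does.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the original program: keeps calling
 * write() until all `len` bytes are written or a real error occurs.
 * Returns 0 on success, -1 on failure (errno is left as set by write()). */
static int write_all(int fd, const void *data, size_t len) {
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;      /* genuine failure */
        }
        p += n;             /* skip past the bytes that were accepted */
        len -= (size_t)n;   /* a short write just leaves more to do */
    }
    return 0;
}
```

Swapping the write() check in the program for `if (write_all(fd, buffer, BUFFER_SIZE) == -1)` should produce the full 3 GB file, with the kernel handling it as at least two separate writes.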

Windows has the same limit, but it is honest about it

In Windows, write calls such as WriteFile take the size of the buffer as a 32-bit value (a DWORD), so this limitation is clearly communicated in the API. Windows will also ensure that, for files, a WriteFile call that completes successfully writes the entire buffer to the disk.

And why am I writing 2 GB of zeros? In the code above, I’m using malloc(), not calloc(), so I wouldn’t expect the values to be zero. Because this is a large allocation, malloc() asks the OS to provide us with the buffer directly (glibc does this with mmap() rather than carving it out of its own heap), and the OS is contractually obligated to provide us with zeroed pages. Otherwise, we could end up reading data left behind by another process.
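You can see the zeroed-pages guarantee without going through malloc() at all. The sketch below asks the kernel for a fresh anonymous mapping, which is my assumption about what a large malloc() ends up doing under glibc, and counts the non-zero bytes in it. The function name count_nonzero_in_fresh_mapping is made up for this illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: maps `size` bytes of fresh anonymous memory and
 * returns how many of them are non-zero, or (size_t)-1 if mmap() fails. */
static size_t count_nonzero_in_fresh_mapping(size_t size) {
    unsigned char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return (size_t)-1;

    size_t nonzero = 0;
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0)
            nonzero++;
    }
    munmap(p, size);
    return nonzero;
}
```

Calling it with any size should return 0 on Linux, since anonymous mappings are zero-filled. Small malloc() calls are a different story: they can be served from memory the process has already used and freed, so you may well get your own old garbage back.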
