CSCI 3000
Spring 2017
Programming Assignment 3

Due: Wednesday, March 22

The assignment

Implement the following in C under Linux.

Write a program that counts the number of bytes used by all regular files in a given directory, including files in its subdirectories, their subdirectories, etc. The directory name should be on the command line that starts the program. For example, if your program is called space and you want to know how many bytes are used by files in directory mydir, you would type

   space mydir

If there are two hard links to the same file, then your program should count the space twice. It should not attempt to avoid counting a multiply-linked file more than once.

Do not traverse symbolic links. If you encounter a symbolic link, just ignore it. Do not count any space for it.

Although you can ask how many blocks are allocated to a file and compute the number of physical bytes used (including wasted space at the end of the file), do not do that. Instead, count only the number of actual bytes in each file, ignoring wasted space on the disk.

Be careful about the directories "." and "..". Directory "." points back to the current directory, and ".." points to the parent directory. Do not go up to the parent directory!


Testing your program

Do not turn in an untested program. To test it, create a test directory and put a few files in it. Be sure to include some subdirectories and symbolic links as well. You will want to be able to create links. Command

  ln old new
creates a file called new that is hard-linked to existing file old. You cannot create a hard link to a directory or across file systems. Command
  ln -s old new
creates a symbolic link called new, referring to existing file or directory old. You can create a symbolic link across file systems. That is, the link can refer to a file that resides in a different file system or on a different device from the link.

To check a symbolic link, use command

  ls -l lnk
where lnk is the name of the link. You should see the file that it is linked to.


Tools

Reading a directory

You will need to include header files <sys/types.h> and <dirent.h> to use the types and functions described here. See the manual pages for readdir and opendir for more detail.

You can get the files in a directory using the opendir and readdir calls. If variable dirp has type DIR* and dirname is a null-terminated string (type char* or const char*), then you can open directory dirname using

    dirp = opendir(dirname);
The opendir function returns NULL if it cannot open the directory.

Declare a variable of type struct dirent*. A call to readdir(dirp) gets an entry from open directory dirp, and returns a pointer to a structure of type struct dirent. So if you write

    struct dirent* dp;
    ...
    dp = readdir(dirp);
then dp is set to point to a structure that contains information about one entry in the directory. The dirent structure has fields providing information about the entry. One of them is the d_name field. To get the name of the file, use dp->d_name.
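
Note that dp->d_name holds only the name of the entry within the directory, not a full path. If the program's working directory is not the directory being read, the directory name and the entry name usually need to be joined before the result is passed to another call. The following is a minimal sketch of one way to do that. The helper name joinPath and the caller-supplied buffer are assumptions made for illustration, not part of the assignment.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the assignment text): writes
   "dirname/name" into buf, which holds bufsize bytes.
   Returns 0 on success, or -1 if the result does not fit. */
int joinPath(char* buf, size_t bufsize, const char* dirname, const char* name)
{
    /* snprintf returns the length the full string would have had,
       so a value >= bufsize means the output was truncated. */
    int n = snprintf(buf, bufsize, "%s/%s", dirname, name);
    if (n < 0 || (size_t) n >= bufsize) {
        return -1;
    }
    return 0;
}
```

For example, joinPath(buf, sizeof(buf), "mydir", dp->d_name) would place a string such as "mydir/a.txt" in buf, provided buf is large enough.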

Each time you call readdir, you get a description of the next thing in the directory. After you have read all of the entries in a directory, readdir returns NULL. At that point, you should close the open directory, using

    closedir(dirp);
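
The open, read, and close steps described above can be combined into a single loop. The sketch below only prints entry names and skips "." and ".."; it does not descend into subdirectories. The function name listDirectory and its return convention are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <dirent.h>

/* Hypothetical example: prints the name of each entry in dirname,
   skipping "." and "..". Returns the number of names printed,
   or -1 if the directory cannot be opened. */
int listDirectory(const char* dirname)
{
    DIR* dirp = opendir(dirname);
    if (dirp == NULL) {
        return -1;
    }

    int count = 0;
    struct dirent* dp;

    /* readdir returns NULL once every entry has been read. */
    while ((dp = readdir(dirp)) != NULL) {
        if (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0) {
            continue;
        }
        printf("%s\n", dp->d_name);
        count++;
    }

    closedir(dirp);
    return count;
}
```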

Getting file attributes

To use the types and functions described here, you will need to include header files <sys/stat.h>, <sys/types.h> and <unistd.h>.

The lstat system call provides status information about a file. Create a variable (say, statusInfo) of type struct stat and another (say, statResult) of type int. Then, to find out information about a file called filename, do

   statResult = lstat(filename, &statusInfo);
If the file exists, lstat returns 0 and puts information about the file into the statusInfo structure. If the file does not exist, lstat returns -1. See the manual page for lstat for a complete description of the available information. Here are some of the pieces of information that you can get from statusInfo.
  1. S_ISDIR(statusInfo.st_mode) is nonzero if the file is a directory.
  2. S_ISLNK(statusInfo.st_mode) is nonzero if the file is a symbolic link.
  3. statusInfo.st_size is the file size, in bytes.

Remark. There is a related function called stat. The difference between stat and lstat is that, when the file is a symbolic link, stat returns information about the file that the link refers to, while lstat returns information about the link itself. For files that are not symbolic links, stat and lstat work the same way.
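
As an illustration of the lstat call and the tests listed above, here is a small sketch that reports what lstat says about one file. The function name reportFile and its single-character return codes are assumptions made for this example. The feature-test macro on the first line is also an assumption: it should only matter if the compiler is run in a strict standard mode such as -std=c99.

```c
/* Assumption: requests POSIX declarations (lstat, S_ISLNK) even when
   compiling in a strict standard mode; harmless otherwise. */
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical example: prints what lstat reports about filename.
   Returns 'd' for a directory, 'l' for a symbolic link, 'o' for
   anything else that exists, and '?' if lstat fails. */
char reportFile(const char* filename)
{
    struct stat statusInfo;
    int statResult = lstat(filename, &statusInfo);

    if (statResult != 0) {
        printf("%s: lstat failed\n", filename);
        return '?';
    }
    if (S_ISDIR(statusInfo.st_mode)) {
        printf("%s: directory\n", filename);
        return 'd';
    }
    if (S_ISLNK(statusInfo.st_mode)) {
        printf("%s: symbolic link\n", filename);
        return 'l';
    }

    /* st_size has type off_t, so cast it before printing. */
    printf("%s: %lld bytes\n", filename, (long long) statusInfo.st_size);
    return 'o';
}
```

Because this sketch uses lstat rather than stat, a symbolic link to a directory is reported as a symbolic link, not as a directory.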

Getting the command line

The command line is provided as an array of strings (type char*). If you use main program heading

  int main(int argc, char** argv)
then argc is the number of parts in the command line (including the command name) and argv is an array of the parts. For example, if you use command line
   space mydir
then argc is 2 and argv contains two strings, "space" and "mydir".
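
A program can check argc before it touches argv[1]. The sketch below shows one way to do that. The helper name directoryArgument and the wording of the usage message are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns the directory name given on the
   command line, or NULL (after printing a usage message) if the
   command line does not have exactly one argument. */
const char* directoryArgument(int argc, char** argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s directory\n",
                argc > 0 ? argv[0] : "space");
        return NULL;
    }
    return argv[1];
}
```

A main function could call directoryArgument(argc, argv) first and return a nonzero exit status if the result is NULL.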

Asking questions and submitting

To ask a question, submit your current work using the following command.

~abrahamsonk/3000/bin/submit q3 files

Then send me an email stating your questions about your program.

Submitting your work

Log into xlogin.cs.ecu.edu. Make sure your current working directory contains your program. Run the following command.

~abrahamsonk/3000/bin/submit 3 space.c