11/29/2017                                                https://zoo.cs.yale.edu/classes/cs323/current/f17h5.v0
                              P R E L I M I N A R Y               S P E C I F I C A T I O N
                                                                      Due 2:00 AM, Friday, 8 December 2017
  CPSC 323            Homework #5               The Shell Game: Sister Sue Saw B-Shells ...
  REMINDERS: Do not under any circumstances copy another person's code or give
  a copy of your code to another person. After discussing the assignment with
  another person (such discussions should be noted in your log file), do not
  take any written or electronic record away and engage in a full hour of mind-
  numbing activity before you work on it again. Sharing with another person ANY
  written or electronic document related to the course (e.g., code or test cases)
  is a violation of this policy.
  Since code reuse is an important part of programming, you may incorporate
  published code (e.g., from textbooks or the Net) in your programs, provided
  that you give proper attribution in your source and in your statement of major
  difficulties AND THAT THE BULK OF THE CODE SUBMITTED IS YOUR OWN.
  (60 points) Bash is a simple shell, a baby brother of the Bourne-again shell
  bash, and offers a limited subset of bash's functionality:
  - execution of simple commands with zero or more arguments
  - definition of local environment variables (NAME=VALUE)
  - redirection of the standard input (<, <<)
  - redirection of the standard output (>, >>)
  - execution of pipelines (sequences of simple commands or subcommands separated
    by the pipeline operator |)
  - execution of conditional commands (sequences of pipelines separated by the
    command operators && and ||)
  - execution of sequences of conditional commands separated or terminated by the
    command terminators ; and &
  - backgrounded commands (&)
  - subcommands (commands enclosed in parentheses)
  - directory manipulation:
         cd dirName
         cd                               (equivalent to "cd $HOME"; HOME is an environment variable)
         pushd dirName                    (other forms are not implemented)
         popd                             (other forms are not implemented)
  - reporting the status of the last simple command, pipeline, or subcommand
    executed by setting the environment variable $? to its "printed" value
    (e.g., the string "0" if the value is zero).
  Once the command line has been parsed, the exact semantics of Bash are those
  of bash, except for the status variable and the items noted below.
  The assignment is to write the function process() called from Hwk5/mainBash.c.
  Thus you should use (that is, link with)
  * Hwk5/mainBash.o as the main program (source is Hwk5/mainBash.c)
  * Hwk2/tokenize.o to lex commands into token lists (interface in Hwk2/parse.h)
  * Hwk5/parse.o to parse token lists into syntactically correct trees of CMD
    structures (interface in Hwk2/parse.h).
  DO NOT MODIFY Hwk5/mainBash.c or Hwk2/parse.h---the source code for
  process() should be in a different file (or files). To enforce this the test
  script may delete/rename local files named mainBash.* or parse.* before trying
  to make your program.
  Use the submit command to turn in your log file and the source files for
  Bash (including a Makefile, but not mainBash.* or parse.*) as assignment 5.
  YOU MUST SUBMIT YOUR FILES (INCLUDING THE LOG FILE) AT THE END OF ANY SESSION
  WHERE YOU HAVE SPENT AT LEAST ONE HOUR WRITING OR DEBUGGING CODE, AND AT LEAST
  ONCE EVERY HOUR DURING LONGER SESSIONS. (All submissions are retained.)
  Notes
  ~~~~~
  1. [Matthew & Stones, Chapter 2] contains a more complete description of bash,
     including environment variables, the various I/O redirection operators,
     pipelines, command operators, command terminators, and subcommands; and
     "man bash" and "info bash" contain more information. But bear in mind
     that there are many features that Bash does not implement. Moreover, the
     behavior of Bash may not match bash in some cases, including (this list will
     expand as we learn of discrepancies):
        a. bash has both shell variables and environment variables. A command like
             % NAME=VALUE
           assigns VALUE to the shell variable NAME, and thereafter sequences like
           $NAME are expanded to VALUE as commands are parsed; while a command like
             % NAME=VALUE printenv NAME
           assigns VALUE to the environment variable NAME in the process that is
           executing printenv (but NAME is not defined in the parent shell). Bash
           only supports the latter construct.
        b. bash allows multiple input, output, and error redirections, with the last
           encountered taking precedence. In Bash the parse() function issues an
           error message instead.
        c. In bash local variable definitions and redirection to/from a file may
           appear only after a subcommand, not before.
        d. In bash $? is a shell variable rather than an environment variable, and
           its value may differ from the status that is reported by Bash.
        e. bash and Bash report background commands and reaped zombies differently.
        f. In bash the pipefail option is not the default.
        g. In Bash the directory stack used by cd, pushd, and popd contains the
           absolute pathnames returned by getcwd() or get_current_dir_name(). In
           bash these pathnames are "massaged". For example, if you cd to /c/cs323
           the top directory name on the stack is /home/classes/cs323, not /c/cs323.
  2. An EOF (CTRL-D in column 1) makes Bash exit since getline() returns NULL.
  3. While executing a simple command, subcommand, or pipeline, Bash waits
     until it terminates, unless it has been backgrounded. Bash ignores SIGINT
     interrupts while waiting, but child processes (other than subshells) do
     not, so that they can be killed by a CTRL-C. Hint: Do not implement this
     feature until everything else seems to be working.
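     The SIGINT behavior in Note 3 might be arranged roughly as follows. This
     is only a sketch under stated assumptions: run_foreground() is a
     hypothetical helper (not part of the assignment's interface), it handles
     a single simple command given as an argv[] array, and it assumes that no
     other children are running while it waits.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STATUS(x) (WIFEXITED(x) ? WEXITSTATUS(x) : 128+WTERMSIG(x))

// Hypothetical helper: run argv[] in the foreground and return its status.
int run_foreground(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    // Ignore SIGINT before forking so that there is no window in which a
    // CTRL-C could kill the shell itself.
    struct sigaction ignore, saved;
    ignore.sa_handler = SIG_IGN;
    ignore.sa_flags = 0;
    sigemptyset(&ignore.sa_mask);
    sigaction(SIGINT, &ignore, &saved);

    int result;
    pid_t pid = fork();
    if (pid < 0) {
        result = errno;                 // save before perror() can change it
        perror("fork");
    } else if (pid == 0) {
        // Child: restore the default action so that CTRL-C kills it.
        signal(SIGINT, SIG_DFL);
        execvp(argv[0], argv);
        int err = errno;                // reached only if execvp() failed
        perror(argv[0]);
        _exit(err);
    } else {
        int status = 0;
        pid_t dead;
        // Only wait() is used; other children are assumed not to exist.
        do {
            dead = wait(&status);
        } while (dead != pid && (dead > 0 || errno == EINTR));
        result = (dead == pid) ? STATUS(status) : -1;
    }

    sigaction(SIGINT, &saved, NULL);    // stop ignoring once the wait is over
    return result;
}
```

     A child killed by CTRL-C would then be reported with the status
     128+SIGINT by the STATUS() macro.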
  4. Bash uses perror() (see "man perror") to report errors from system calls.
     It may ignore error returns from close(), dup(), dup2(), setenv(), wait(),
     waitpid(), getcwd(), and get_current_dir_name(), but not from chdir(),
     execvp(), fork(), open(), and pipe().
        Bash also reports an error if the number of arguments to a built-in command
        is incorrect or if an error is detected during execution of that command.
        Execution of the command is skipped.
        All error messages are written to stderr and are one line long.
  5. For simplicity, process() may ignore the possibility of error returns from
     malloc() and realloc(). However, all storage that it allocates must still
     be reachable after it returns to main().
  6. The easiest way to implement subcommands (and possibly pipelines as well) is
     to use a subshell (i.e., a child of the shell process that is also running
     Bash) to execute the subcommand and exit with its status before returning to
     main().
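     The subshell idea in Note 6 can be sketched as below. The names
     in_subshell() and finish_with() are hypothetical; the function pointer
     BODY stands in for whatever routine evaluates the subcommand's CMD tree,
     whose real type is not assumed here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STATUS(x) (WIFEXITED(x) ? WEXITSTATUS(x) : 128+WTERMSIG(x))

// Run BODY(ARG) in a forked copy of the current process and return the
// status with which that copy exited (or -1 if fork() or wait() fails).
int in_subshell(int (*body)(void *), void *arg)
{
    assert(body != NULL);
    fflush(NULL);                       // avoid duplicating buffered output
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        exit(body(arg));                // the child never returns to main()

    int status = 0;
    pid_t dead;
    // This sketch assumes that no other children matter while it waits.
    while ((dead = wait(&status)) != pid) {
        if (dead < 0)
            return -1;
    }
    return STATUS(status);
}

// Demo body: pretend the subcommand finished with the status in *ARG.
int finish_with(void *arg)
{
    return *(int *)arg;
}
```

     Because the body runs in a child, anything it does to its own process
     state (for example chdir()) is invisible to the parent shell.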
  7. If you use getenv() or setenv() to get or set environment variables, you
     must first #define _GNU_SOURCE since they are not part of the ANSI Standard.
  8. You may find mkstemp() or tmpfile() useful when implementing HERE documents.
     Note: Deleting an open file does not close the file descriptor (see "man 2
     unlink").
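     One way that tmpfile() could back a HERE document is sketched below.
     The helper name here_doc_fd() is hypothetical, and the sketch assumes
     that the text of the HERE document is already available as a string.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: write TEXT to an anonymous temporary file and return
// a file descriptor positioned at its start, or -1 on failure.
int here_doc_fd(const char *text)
{
    assert(text != NULL);
    FILE *fp = tmpfile();               // already unlinked; vanishes on close
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF || fflush(fp) == EOF) {
        fclose(fp);
        return -1;
    }
    int fd = dup(fileno(fp));           // this copy survives the fclose()
    fclose(fp);
    if (fd < 0)
        return -1;
    if (lseek(fd, 0, SEEK_SET) < 0) {   // rewind so a reader sees all of TEXT
        close(fd);
        return -1;
    }
    return fd;
}
```

     A child would presumably dup2() the returned descriptor onto
     STDIN_FILENO and close the original before calling execvp().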
  9. Hwk5/mainBash.c contains functions that you may find useful for debugging:
     * dumpList() dumps a token list
     * dumpTree() dumps a parse tree of CMD structures
     If the environment variable DUMP_LIST (DUMP_TREE) exists, then Bash
     dumps the token list using dumpList() (the parse tree using dumpTree()).
  A. Hwk5/process-stub.h contains the #include statements, the STATUS() macro,
     and the function prototype for process() from my solution.
  B. No, you may not use system() or /bin/*sh.
  Fine Points
  ~~~~~~~~~~~
  1. For a simple command, the status is either the status of the program
     executed (*) or the global variable errno (if some system call failed
     while setting up to execute the program).
             (*) This status is normally the value WEXITSTATUS(status), where the
             variable status contains the value returned by the call to waitpid()
             that reported the death of the process. However, for processes that are
             killed (that is, for which WIFEXITED(status) is false), that value may be
             zero. Thus you should use the macro
               #define STATUS(x) (WIFEXITED(x) ? WEXITSTATUS(x) : 128+WTERMSIG(x))
             instead (see Hwk5/process-stub.h).
        For a pipeline, the status is that of the latest (that is, rightmost) stage
        to fail, or 0 if the status of every stage is true. (This is the behavior
        of bash with the pipefail option enabled.)
        For a subcommand, the status is that of the last simple command, pipeline,
        or subcommand to be executed as part of that subcommand.
        For a backgrounded command, the status in the invoking shell is 0.
        For a built-in command, the status is 0 if successful, the value of errno if
        a system call failed, and 1 otherwise (e.g., when the number of arguments is
        incorrect).
        Note that this status may differ from that reported by bash. The command
          % /c/cs323/Hwk5/Tests/exit N
        will exit with the status N.
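     The status rules in Fine Point 1 can be illustrated with a small sketch.
     STATUS() is the macro quoted above; child_status() and pipeline_status()
     are hypothetical demo helpers, and the pipeline rule is applied to an
     array of statuses that are assumed to have been collected already.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STATUS(x) (WIFEXITED(x) ? WEXITSTATUS(x) : 128+WTERMSIG(x))

// Demo helper: fork a child that exits with CODE (when SIG is 0) or kills
// itself with signal SIG; return STATUS() of what wait() reports, or -1.
int child_status(int code, int sig)
{
    assert(code >= 0 && code <= 255);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);
        _exit(code);
    }
    int status = 0;
    pid_t dead;
    while ((dead = wait(&status)) != pid) {
        if (dead < 0)
            return -1;
    }
    return STATUS(status);
}

// Status of a pipeline whose N stages ended with statuses st[0..n-1]
// (leftmost stage first): that of the rightmost stage to fail, or 0 if
// every stage succeeded.
int pipeline_status(const int st[], size_t n)
{
    assert(st != NULL || n == 0);
    for (size_t i = n; i > 0; i--) {
        if (st[i-1] != 0)
            return st[i-1];
    }
    return 0;
}
```

     For a child that exits normally, child_status() returns its exit code;
     for a child killed by SIGKILL it returns 128+SIGKILL, whereas
     WEXITSTATUS() alone might report zero.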
  2. In bash the status $? is an internal shell variable. However, since Bash
     does not have such variables, it has no means (other than a HERE document)
     to check its value. Thus in Bash the status is an environment variable,
     which can be checked using /usr/bin/printenv (that is, "printenv ?").
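     Storing the status as an environment variable might look like the
     following sketch. The helper name set_status() is hypothetical; only
     setenv() and the variable name "?" come from the specification.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: store STATUS as a decimal string in the environment
// variable "?" and return STATUS unchanged.
int set_status(int status)
{
    char buf[32];                       // ample room for any int in decimal
    int n = snprintf(buf, sizeof buf, "%d", status);
    assert(n > 0 && (size_t)n < sizeof buf);
    setenv("?", buf, 1);                // 1 = overwrite any previous value
    return status;
}
```

     Since children inherit the environment, a later "printenv ?" would then
     print the stored value.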
  3. The command separators && and || have the same precedence, lower than |, but
     higher than ; or &.
        && causes the command following (a simple command, subcommand, or pipeline)
        to be skipped if the current command exits with a nonzero status (= FALSE,
        the opposite of C). The status of the skipped command is that of the
        current command.
        || causes the command following to be skipped if the current command exits
        with a zero status (= TRUE, the opposite of C). The status of the skipped
        command is that of the current command.
        Since && and || have equal precedence, in the command
          (1)$ A || B && C
        A is always executed; if the status of A is zero, B is skipped and C is
        executed; and if the status of A is nonzero, B is executed and, if its
        status is zero, C is executed. Note that this is not what the CMD tree
        for this command might suggest.
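     The left-to-right rule in Fine Point 3 can be checked with a small
     simulation. This sketch does not use the CMD tree at all: the enum
     names, run_chain(), and the arrays of pretend statuses are all
     hypothetical and serve only to show which commands run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum op { OP_AND, OP_OR };              // hypothetical names for && and ||

// Walk the chain  c[0] ops[0] c[1] ops[1] ... c[n-1]  from left to right.
// would[i] is the status that c[i] would return if it ran; ran[i] is set to
// record whether it did run. Returns the final status of the chain.
int run_chain(const int would[], const enum op ops[], bool ran[], size_t n)
{
    assert(n > 0);
    int status = would[0];              // the first command always runs
    ran[0] = true;
    for (size_t i = 1; i < n; i++) {
        // && skips after a nonzero status; || skips after a zero status.
        bool skip = (ops[i-1] == OP_AND) ? (status != 0) : (status == 0);
        ran[i] = !skip;
        if (!skip)
            status = would[i];
        // A skipped command takes the current status, so nothing changes.
    }
    return status;
}
```

     For A || B && C with A succeeding, this marks B as skipped and C as
     run, which agrees with the description above.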
  4. Anything written to stdout by a built-in command is redirectable.
        When a built-in command fails, Bash continues to execute commands.
        When a built-in command is invoked within a pipeline, is backgrounded, or
        appears in a subcommand, that command has no effect on the parent shell.
        For example, the commands
          (1)$ cd /c/cs323 | ls
        and
          (2)$ ls & cd .. & ls
        do not work as you might otherwise expect.
  5. When a redirection fails, Bash does not execute the command. The status of
     the command (or pipeline stage) is the errno of the system call that failed.
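     A failing redirection might be handled along these lines. The helper
     redirect() and its parameters are hypothetical; the sketch assumes that
     the caller picks the open() flags for the operator in question and
     skips the command whenever the result is nonzero.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: make descriptor TARGET (e.g., STDOUT_FILENO) refer to
// PATH opened with FLAGS. Returns 0 on success; on failure reports the error
// on stderr and returns the errno of the open() that failed.
int redirect(const char *path, int flags, int target)
{
    assert(path != NULL);
    int fd = open(path, flags, 0666);   // the mode matters only with O_CREAT
    if (fd < 0) {
        int saved = errno;              // save before perror() can change it
        perror(path);
        return saved;
    }
    if (fd != target) {
        dup2(fd, target);               // error returns may be ignored here
        close(fd);
    }
    return 0;
}
```

     Plausible flag choices are O_RDONLY for <, O_WRONLY|O_CREAT|O_TRUNC for
     >, and O_WRONLY|O_CREAT|O_APPEND for >>.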
  6. When Bash runs a command in the background, it writes the process id to
     stderr using the format "Backgrounded: %d\n".
        Bash reaps zombies periodically (that is, at least once during each call to
        process()) to avoid running out of processes. When doing so, it writes the
        process id and status to stderr using the format "Completed: %d (%d)\n".
  7. To make programming Bash more challenging, Bash may not call waitpid() with
     a specific pid, or use any other system call that names the pid of the
     process whose death it is awaiting. That is, it may only use wait() and
     waitpid(-1,...). Unlike the usual tests, the test of this constraint will
     deduct 4 points from the total score if it detects a violation.
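     Fine Points 6 and 7 together suggest a pattern like the sketch below.
     The names reap_zombies() and wait_for() are hypothetical; the message
     format is the one given in Fine Point 6, and only wait() and
     waitpid(-1,...) are called.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STATUS(x) (WIFEXITED(x) ? WEXITSTATUS(x) : 128+WTERMSIG(x))

// Reap, without blocking, every child that has already terminated and
// report each one on stderr. Returns the number of children reaped.
int reap_zombies(void)
{
    int count = 0, status;
    pid_t dead;
    while ((dead = waitpid(-1, &status, WNOHANG)) > 0) {
        fprintf(stderr, "Completed: %d (%d)\n", (int)dead, STATUS(status));
        count++;
    }
    return count;
}

// Block until the foreground child PID terminates, using only wait(). Any
// other child that dies first is reported as completed. Returns STATUS()
// of PID, or -1 if wait() fails for a reason other than an interrupt.
int wait_for(pid_t pid)
{
    assert(pid > 0);
    int status = 0;
    pid_t dead;
    while ((dead = wait(&status)) != pid) {
        if (dead < 0) {
            if (errno == EINTR)
                continue;               // interrupted by a signal; try again
            return -1;
        }
        fprintf(stderr, "Completed: %d (%d)\n", (int)dead, STATUS(status));
    }
    return STATUS(status);
}
```

     Calling reap_zombies() once at the start of each call to process()
     would satisfy the "at least once" requirement in Fine Point 6.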
  8. gdb can follow child processes. See the gdb manual (link on the class web
     page) for details.
  9. As noted in the man page for perror() ("man 3 perror"):
             Note that errno is undefined after a successful system call or library
             function call: this call may well change this variable, even though it
             succeeds, for example because it internally used some other library
             function that failed. Thus, if a failing call is not immediately followed
             by a call to perror(), the value of errno should be saved.
  Limitations
  ~~~~~~~~~~~
  The following features will be worth at most the number of points shown:
   * (20 points) pipelines
   * (12 points) &&, ||, and &
   * (12 points) subcommands
   * (12 points) the cd built-in
   * (12 points) the status variable $?
   * (10 points) HERE documents
   * ( 6 points) Reaping zombies
  Here "at most" signals a crude upper bound intended to give more flexibility
  while developing the test script and to allow interactions among features.
                                                                                                             CS-323-11/11/17