Remote interactive scripting
When I started preparing this post just over a week ago, I figured I'd have plenty of time to get everything worked out. Yet here I am on an airplane, dodging Icelandic ash clouds in the Scandinavian sky. I guess it's only fitting that I left everything I had prepared tucked away safely on my laptop, which is sitting neatly on my desk at home.
Introduction
I recently worked on a project where an application was equipped with only a very basic command line interface, with no standard interactive shell or other scripting capabilities. In time I grew frustrated with this setup and started experimenting with using Bash (the Bourne Again Shell) to control the application remotely. By experimenting, I of course mean several failed attempts; fortunately, they were followed by a successful one which surpassed my initial expectations. Note that this is primarily aimed at Unix/Linux environments, and some things below are even GNU specific, so if you're running Windows you could try Cygwin, but you will probably still need some modifications.
The concept presented here allows you to take an off-the-shelf shell (Bash, if you want to use the provided code) and, with very little work, attach it to more or less any simple command line interface in the back end. It could definitely spare you the effort of implementing elaborate scripting UIs for your application or game, and will probably be interesting for quality assurance and testing as well. I grew very fond of this solution, and I'll most likely use it again.
Why would you want to do this? The obvious features Bash provides are scripting, functions, and control flow constructs such as if-statements, loops and switch/case. Normally these are also available in Ye Olde Scripting Language which you've already integrated into your code, and you might argue that standard shell pipelines are easily mapped onto constructs of your favorite language. But will it let your application dump its in-memory log, pipe it through a filter program, and have it pop up in your favorite editor (i.e. Vim, of course)? That is what you get when your local tools are integrated with the target platform. What Bash provides that is rarely available otherwise is on the interactive side of things. Bash is intended to be used interactively as well as scripted, which becomes obvious first and foremost through the GNU readline integration: tab-completion of commands and parameters (which can be very handy), and a searchable history of previous commands. There's also the prompt, which can be put to really good use as well.
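To make the log example concrete: once a proxy command is in place, remote output composes with local tools like any other command. Here is a tiny self-contained sketch; the x function below fakes the back end's reply, where a real setup would go over the wire:

```shell
# Hypothetical: 'x dumplog' would ask the back end for its in-memory log.
# Faked here with a function so the pipeline can be run stand-alone.
x() { printf 'DEBUG boot\nERROR disk full\nDEBUG tick\n'; }

# Filter the remote log with an ordinary local pipeline.
filtered=$(x dumplog | grep -v '^DEBUG')
echo "$filtered"
```

From here it is a small step to piping the result into your editor instead.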
General structure
The setup is based around a daemon process which handles the communication, a client application which Bash invokes whenever it wants to communicate, and, last but not least, a Bash script which does a little bit of trickery to set up the environment. There's also the back end, of course, which is what we want to communicate with. There are three reasons for using a background daemon to do the actual communication. First, you save the handshake overhead, including login if you happen to have some kind of security (which the example code will not take into account), and this greatly increases responsiveness. Second, it allows any in-between output to reach the terminal: if alerts are emitted on the debug connection you will see them, and if your command took really, really long to complete and your client assumed it timed out and closed the connection, you will still eventually see the output of that command. Third, the communications daemon can run facing a different terminal, which sends all in-between communication to a separate terminal; this can be really useful when using, for example, alerts as mentioned.
The client communicates with the daemon using Unix-domain sockets. It simply sends the command, shuts down the writing end of the socket, waits for incoming data and relays it to standard out until the remote end closes the connection, at which point it exits. This is effectively the proxy for the command being executed remotely. I've never had any use for piping data into commands in the back end, so I've only implemented a simple request-response type of communication, which has been quite sufficient for me, but I suppose it's quite possible to extend it to pipe data via the client to the daemon and onward to the back end as well.
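To illustrate the request-response exchange, here is a minimal Python 3 sketch of the client's half of it. The names are mine, and a socketpair with a fake daemon thread stands in for the real Unix-domain socket, so the example is self-contained:

```python
import socket
import threading

def request(sock, command):
    """Send one command, half-close, then read the full response."""
    sock.sendall(command.encode())
    sock.shutdown(socket.SHUT_WR)  # signals end-of-request to the peer
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:               # peer closed: response is complete
            break
        chunks.append(data)
    sock.close()
    return b"".join(chunks).decode()

def fake_daemon(sock):
    """Stand-in for the communications daemon: echo the command back."""
    req = []
    while True:
        data = sock.recv(4096)
        if not data:               # client did shutdown(SHUT_WR)
            break
        req.append(data)
    sock.sendall(b"you said: " + b"".join(req))
    sock.close()

client, server = socket.socketpair()
t = threading.Thread(target=fake_daemon, args=(server,))
t.start()
reply = request(client, "status")
t.join()
print(reply)  # -> you said: status
```

The half-close is what lets both ends know when the other is done without any framing protocol.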
Originally I wrote the daemon and client in Tcl, using Expect, which makes it really easy to implement login procedures. Tcl is actually quite a neat language which is often underrated, and I intend to return to it in future posts. This time around, however, I wanted to brush up on my Python, so I rewrote them. Tcl also does not natively support Unix-domain sockets, and in my case it was preferable to use those, as everyone worked on shared servers, which meant free port numbers were hard to come by and required some tricky collision avoidance.
The code
Here's the code I wrote for this post; feel free to do whatever you want with it and adapt it to your needs. It's by no means complete, and there are a bunch of error situations and corner cases which should be handled but aren't. I'm posting the entire thing in case you'd like to try it out. Enjoy.
The back end server
The remote application could be your Android app, your console game, or even your PC application. For demonstration purposes, I've implemented a very simple server in Tcl which simply outputs a prompt, reads a command, and executes it using a shell, dumping the output back over the wire.
```tcl
#!/usr/bin/tclsh

set quit false

proc prompt {channel} {
    puts -nonewline $channel {$> }
    flush $channel
}

proc inbound {channel addr port} {
    global quit status
    if [catch {
        set command [string trim [gets $channel]]
        if {$command eq "exit" || [eof $channel]} {
            close $channel
        } elseif {$command eq "quit"} {
            puts $channel "Bye bye..."
            close $channel
            set quit true
        } else {
            if {$command eq "status"} {
                puts $channel "Status: [info cmdcount]"
            } elseif {$command eq ""} {
                # no command, just spit out a new prompt
            } else {
                if [catch {
                    puts -nonewline $channel [eval "exec -keepnewline -- $command"]
                } msg] {
                    puts $channel "failure: $msg"
                }
            }
            prompt $channel
        }
    } msg] {
        puts "Connection to $addr lost: $msg"
        close $channel
    }
}

proc server {channel addr port} {
    fileevent $channel readable [list inbound $channel $addr $port]
    puts $channel "Welcome to the dummy server!\n"
    prompt $channel
}

socket -server server 9900
vwait quit
```
The implementation is not really important, and I provide it only for completeness and in case you just want to try things out. It might be worth considering a simple human-readable interface similar to this one though, for those emergency situations where you'll only have your trusty telnet client to go around. Here I use a simple fixed prompt to make it easily recognizable by the communications daemon.
The communications daemon
The communications daemon basically consists of three parts: one doing the actual communication, one monitoring the activity, and the last one doing the setup, connecting, forking and so on. In the original implementation I also had a section covering the internal control mechanisms between the client and the daemon, which is only barely present here.
```python
#!/usr/bin/python
import getopt
import os
import sys
import select
from select import poll
import socket
import atexit

prompt = "$> "

def inbound(sock, timeout=10000):
    p = poll()
    p.register(sock.fileno(), select.POLLIN | select.POLLPRI | select.POLLERR)
    while True:
        if p.poll(timeout):
            data = sock.recv(4096)
            if data:
                yield data
            else:
                return
        else:
            return  # timed out

def terminal(data):
    print data,
```
The beginning of the file is just a few helper functions; let's move on.
```python
class Backend:
    def __init__(self, host, port):
        self.sock = socket.socket()
        self.sock.connect((host, port))

    def fileno(self):
        return self.sock.fileno()

    def execute(self, line, client=None):
        self.sock.send(line + "\n")
        return self.capture(client)

    def passthru(self):
        data = self.sock.recv(4096)
        if data:
            terminal(data)
        else:
            raise IOError("Connection lost")

    def capture(self, client=None):
        buffer = ""
        output = None
        if not client:
            output = ""
        for more in inbound(self.sock):
            buffer = buffer + more
            if prompt in buffer:
                data, rest = buffer.split(prompt, 1)
                if client:
                    client.send(data)
                else:
                    output = output + data
                if rest:
                    terminal(rest)
                return output
            else:
                # if we assume a prompt/separator is always on a single line, we can
                # relay the lines we've got already.
                lines = buffer.splitlines(True)
                buffer = lines.pop()  # keep the last, unfinished, line in the buffer
                for line in lines:
                    if client:
                        client.send(line)
                    else:
                        output = output + line
        if buffer:
            # timeout..
            client.send(buffer)
        return output
```
Above is the back end communication part. The only thing of note is that we keep sending data back to the client line by line, to avoid the problem of partial reads. We assume the prompt always sits on a single line (i.e. the prompt itself doesn't contain a newline), and only send complete lines back to the client (which therefore can't contain the prompt, or we'd have spotted it).
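The buffering logic is easier to see in isolation. Here is a Python 3 sketch of just that trick, with names of my own choosing: complete lines are relayed immediately, the trailing partial line is held back until more data or the prompt arrives.

```python
PROMPT = "$> "

def relay(chunks):
    """Consume chunks until the prompt shows up; return (output, leftover)."""
    buffer = ""
    relayed = []
    for more in chunks:
        buffer += more
        if PROMPT in buffer:
            data, rest = buffer.split(PROMPT, 1)
            relayed.append(data)
            return "".join(relayed), rest
        # keepends=True so relayed lines stay byte-for-byte identical
        lines = buffer.splitlines(True)
        buffer = lines.pop()           # keep the unfinished line
        relayed.extend(lines)
    return "".join(relayed), buffer    # ran out of input (timeout)

# Partial reads split a line and the prompt across chunk boundaries:
output, leftover = relay(["line one\nli", "ne two\n$> trailing"])
print(repr(output))    # -> 'line one\nline two\n'
```

Because only complete lines are relayed, a prompt can never be half-sent to the client.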
Moving on to the monitoring part.
```python
def serve(srv, pidFile, exithandlers=[]):
    back = Backend("localhost", 9900)

    p = poll()
    p.register(srv.fileno(), select.POLLIN | select.POLLPRI | select.POLLERR)
    p.register(back.fileno(), select.POLLIN | select.POLLPRI | select.POLLERR)

    # consume welcome message, wait for prompt.
    # do this before the fork, so we'll get a nice clean
    # message before the parent process exits
    print back.capture()

    pid = None
    if pidFile:
        sys.stdin.close()
        pid = os.fork()
    if pid:
        open(pidFile, "w").write(str(pid))
    else:
        # we don't want to register these until after the fork.
        for handler in exithandlers:
            atexit.register(handler)
        while True:
            try:
                events = p.poll()
                for fd, event in events:
                    if fd == back.fileno():
                        # inbetween output, echo to terminal
                        back.passthru()
                    elif fd == srv.fileno():
                        c, addr = srv.accept()
                        command = ""
                        for data in inbound(c):
                            command += data
                        if command[0] == "%":
                            control = command[1:]
                            if control == "check":
                                back.execute("")
                        else:
                            back.execute(command, c)
                        c.close()
            except IOError, err:
                # reconnection logic might go here
                print str(err)
                break
```
The fork is done after connecting and consuming the welcome message. Bash will wait for the parent process to finish before issuing another prompt, which will produce a nice and neat welcome message when connecting, and there will be no fight over who gets to print to the terminal first.
After that, we sit and wait for something to happen. If the back end sends something, we relay it to the terminal, and if the listening socket triggers, we accept the connection, issue the command, capture its output and relay it back to the client. In this implementation, I check for an initial percent sign to indicate communication addressed to the daemon itself. In the original implementation this logic was extensive, as I needed to switch the expected prompt, and sometimes count opening and closing braces in the back end output, in order to determine when a command had actually finished.
```python
def main():
    try:
        optlist, positional = getopt.getopt(sys.argv[1:], "P:u:")
    except getopt.GetoptError, err:
        print str(err)
        sys.exit(1)

    pidFile = None
    unix = None
    srv = None
    try:
        for opt, arg in optlist:
            if opt == '-P':
                pidFile = arg
            elif opt == '-u':
                srv = socket.socket(socket.AF_UNIX)
                srv.bind((arg))  # might throw, which is why we store it afterwards.
                unix = arg
        if not srv:
            print "No address specified"
            sys.exit(1)
        srv.listen(5)
        serve(srv, pidFile,
              [lambda y=x: os.unlink(y) for x in [pidFile, unix] if x])
        srv.close()
    except IOError, err:
        print str(err)
        if unix:
            os.unlink(unix)
        sys.exit(1)
    except KeyboardInterrupt, err:
        pass  # we're done here.

if __name__ == "__main__":
    main()
```
And last but not least, a block of boilerplate. Well, sort of, anyway: checking command line options and setting up the server socket. I use the interrupt signal to shut down the daemon, which manifests as a KeyboardInterrupt in Python.
The client proxy
The client proxy is really pretty dumb, and only performs a single request and response.
```python
#!/usr/bin/python
import getopt
import socket
import sys
import select
from select import poll

def inbound(sock, timeout=10000):
    p = poll()
    p.register(sock.fileno(), select.POLLIN | select.POLLPRI | select.POLLERR)
    while True:
        if p.poll(timeout):
            data = sock.recv(4096)
            if data:
                yield data
            else:
                return
        else:
            return

def main():
    try:
        optlist, positional = getopt.getopt(sys.argv[1:], "u:")
    except getopt.GetoptError, err:
        print str(err)
        sys.exit(1)

    unix = None
    for opt, arg in optlist:
        if opt == '-u':
            unix = arg

    try:
        sock = None
        if unix:
            sock = socket.socket(socket.AF_UNIX)
            sock.connect((unix))
        else:
            print "No address specified"
            sys.exit(1)
        sock.send(" ".join(positional))
        sock.shutdown(socket.SHUT_WR)
        for data in inbound(sock):
            print data,
        sock.close()
    except IOError, err:
        print str(err)
        sys.exit(1)

if __name__ == "__main__":
    main()
```
It sends the command line parameters, separated by spaces, to the daemon, and then closes the writing end of the socket, signalling to the daemon that we're done. Then it just sits and waits for data until the daemon closes its end of the socket.
Smash
Let's move on to the Bash script, which I call smash, because, well, err… It has two modes of operation: interactive and non-interactive. I wanted it to behave in a similar fashion to any shell, which caused me more than one headache and led me into several dead ends.
```bash
#!/bin/bash

declare -a chainscript

if [ ! "$PS1" ]; then
    suppress=false
    for param in "$@"; do
        if ! $suppress; then
            case $param in
                --)
                    suppress=true
                    ;;
                -*)
                    # handle smash parameters, need to be exported through
                    # environment if executing the interactive shell
                    ;;
                *)
                    # a script file was given
                    chainscript[${#chainscript[*]}]=$param
                    suppress=true
                    ;;
            esac
        else
            # all parameters belong to the script
            chainscript[${#chainscript[*]}]=$param
        fi
    done
    if [ ${#chainscript[*]} -eq 0 ]; then
        # no script specified, restart as an interactive shell
        self=$(readlink -f $(which "$0"))
        export SMASH_BASE=${self%/*}
        exec /bin/bash --init-file $self -i
    fi
else
    # we're an interactive shell, let's set up the prompt to do something useful
    PS1='\u@\h:\w [$(online && x status || echo -)]\$ '
fi
```
This is the first section. It starts by checking whether Bash is being run interactively or non-interactively. When first started it will always be non-interactive, as it's executing the smash script. If it finds it's being run non-interactively, it scans the parameters, and if it finds a script to execute, it remains non-interactive and stores the script until the environment has been set up. If it doesn't find any script to execute, it restarts Bash, supplying itself as the init file instead. Should it find it's being run as an interactive shell, which will be the case after it has restarted itself, it proceeds by setting up the interactive prompt. Note that you can call commands from within the prompt, and if you have a traversable directory structure in your back end, you could for example query the current directory in the prompt directly.
```bash
client="$SMASH_BASE/client.py"
daemon="$SMASH_BASE/daemon.py"
socket="$(readlink -f unix-socket)"
pidfile="$(readlink -f daemon.pid)"
pid=0

online() { [ $pid -gt 0 ]; }
x() { online && $client -u $socket "$@"; }
ctl() { x %$*; }

connect() {
    if online; then disconnect; fi
    if $daemon -P $pidfile -u $socket; then
        pid="$(<$pidfile)"
    fi
}

disconnect() { if online; then kill -INT $pid >/dev/null 2>&1; pid=0; fi; }

check() { if online; then ctl check || disconnect; fi; }

trap disconnect EXIT

# load utility files here..
[ -r ~/.smashrc ] && . ~/.smashrc

PROMPT_COMMAND="check;$PROMPT_COMMAND"
```
The middle section is the interesting part, and this is where you set up your environment around the proxy and daemon processes. You'll need the connect and disconnect commands. The check in the prompt command (which is executed before each prompt is issued) is useful for checking the connection to the daemon process, as it's executed in the shell itself and can modify the environment. Any command in the prompt itself (rather than in the PROMPT_COMMAND environment variable) runs in its own subshell and won't be able to affect the surrounding environment, such as setting a disconnected flag (yep, I wandered into that one as well). I like to have a command, x, for sending raw commands to the back end, and then build my other commands around that.
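The subshell pitfall is easy to demonstrate on its own. In this sketch (names are mine), probe stands in for the check function:

```shell
pid=42
probe() { pid=0; }     # pretend the daemon went away

# Command substitution forks a subshell, just like $(...) inside PS1,
# so the assignment to pid is lost:
: "$(probe)"
echo "after subshell: pid=$pid"    # still 42

# Invoked directly, like a command in PROMPT_COMMAND, it sticks:
probe
echo "after direct call: pid=$pid" # now 0
```

This is exactly why check belongs in PROMPT_COMMAND and not in the prompt string.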
```bash
# footer
# if a script was specified, we need to do some chain loading here
if [ ${#chainscript[*]} -gt 0 ]; then
    # reset positional parameters (given on the commandline)
    set - "${chainscript[@]:1}"
    # and, source the script.
    source "${chainscript[0]}"
fi
```
The last section is the continuation of the first, i.e. executing the script which was supplied on the command line, if any. In Bash we can reset the positional parameters, which creates an intuitive environment for anyone using smash:
```bash
smash my-script.smash 10 test cat
```

will run my-script.smash in the smash environment (you will probably want to start my-script.smash off with connect), with the positional parameters $1, $2, and $3 set to "10", "test", and "cat" respectively.
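The chain-loading trick can be tried in isolation. This sketch uses made-up file names and the standard `set --` spelling (equivalent here to the script's `set -`):

```shell
# Pretend these came off the command line: script name first, then its args.
chainscript=(demo.smash 10 test cat)

# Reset $1.. to everything after the script name.
set -- "${chainscript[@]:1}"

echo "$# positional parameters: $1 $2 $3"
```

A script sourced after this sees $1, $2 and $3 exactly as if it had been invoked directly.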
Moving on
When I first started experimenting with this, I only really wanted auto-login and the ability to use some loops and variables, but I ended up with a lot more, and as it turns out, the feature I appreciate the most is tab-completion. Bash provides the builtin 'complete' to set up completions (for more information, see the Programmable Completion section of the Bash manual). There are a few ways to supply completions: some are built in, for completing file names or perhaps variable names; you can give a fixed list of words; or you can supply a function which will generate possible completions at runtime (I hope you see where I'm going with this). Since the command line is being built locally (as opposed to working over a telnet connection, where you'd have to cancel your command in order to check the available options), there's nothing stopping you from issuing commands to query the possible options to a command, and generating the completions with appropriate sed scripts or similar.
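A sketch of what back-end-driven completion could look like. Everything here is hypothetical: in real use _x_commands would query the daemon (something like x help piped through sed), but it is faked with a function so the example stands alone:

```shell
# Fake back-end reply: one command name per line.
_x_commands() { printf 'status\nreboot\nloglevel\n'; }

# Completion function for the 'x' command: filter the back end's
# command list against the word currently being completed.
_x_complete() {
    local cur=${COMP_WORDS[COMP_CWORD]}
    COMPREPLY=( $(compgen -W "$(_x_commands)" -- "$cur") )
}

# Register it; pressing TAB after 'x ' now offers the remote commands.
complete -F _x_complete x
```

Swap the fake _x_commands for a real query to the daemon and the shell will complete remote commands as if they were local.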
And for the record, I called my wife after touching down, and had her push my git repository. :)
Further reading
Since we're actually using Bash, I suggest having a look at the Bash user manual if you're interested.