Persistent Terminal Sessions

Photo by Philipp Katzenberger / Unsplash

It is common practice to SSH into an EC2 instance and perform one-off administrative tasks. Some of these are small, such as changing a cron schedule or viewing logs, and pose little to no challenge. Others are large and time-consuming, sometimes running for a couple of hours or even days. This is where life starts getting a bit difficult. Let's dive into the possible problems and their solutions.

Problems

For time-intensive operations, monitoring progress matters, at the very least to check that they are still alive and healthy. However, the terminal in which you ran the command may itself become unresponsive when inactivity leads to a session timeout. The same can happen when the internet connection drops. The story doesn't end here. Sometimes it is not only the SSH session that times out: if you SSH-ed into the EC2 instance by connecting to a VPN first, the VPN session itself may time out even before the SSH session does.

When the SSH session ends prematurely, the processes started from it are typically terminated as well, because they receive a hangup signal (SIGHUP) once the terminal goes away. It hurts like a poison sting, especially when the running process is not idempotent. Imagine this happening while:

  • executing non-idempotent long-running Python scripts on EC2
  • running multiple complex SQL scripts to provide cash bonuses to millions of users by updating a column in the respective database records
  • syncing billions of small files from an on-prem data center to S3, where an interrupted run means hours of metadata comparison to calculate the delta before the sync can resume

Not fun!

Solutions

So, how do we avoid running into these terrible situations?

We need a facility that keeps the processes we start running in the background even if the terminal goes unresponsive for whatever reason. In addition, we would like to be able to check the progress of the processes we kicked off.

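What we need is called a terminal multiplexer. A multiplexer keeps terminal sessions running on the server, independent of your SSH connection, so a session can continue in the background while you are disconnected, and you can run multiple sessions at once.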

There are a couple of options:

1- screen

screen is a terminal multiplexer developed by the GNU project. It comes preinstalled on many Linux distributions. If not, it can be easily installed from the distribution's package manager.

After logging into the EC2 instance, just run screen to start a new session, or screen -S <session-name> to give the session a name. Run the required commands in the session as you would in any terminal.

The session can be detached by pressing Ctrl-a followed by d. To reattach, run screen -r <session-name>. The name can be omitted if there is only one detached session.
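
The steps above can be sketched as a short shell session. This is only an illustration under a few assumptions: the session name long-job and the loop standing in for a real task are made up, and screen is assumed to be installed. The -d -m flags start the session already detached, which is handy when you want to launch a job without entering the session first.

```shell
# Hypothetical session name, used only for illustration.
SESSION="long-job"

if command -v screen >/dev/null 2>&1; then
  # -d -m starts the session detached; -S gives it a name.
  # The loop is a stand-in for a real long-running task.
  screen -dmS "$SESSION" sh -c 'for i in 1 2 3 4 5 6; do echo "step $i"; sleep 10; done'

  # List the sessions; the new one should be reported as detached.
  # The exit status of -ls differs between screen versions, hence "|| true".
  screen -ls || true
else
  echo "screen is not installed"
fi

# To look at the job later, reattach interactively:
#   screen -r long-job
# and detach again with Ctrl-a followed by d.
```

Since the session here runs a single command, it ends on its own once that command finishes. A session started with plain screen -S stays around until you exit its shell.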

2- tmux

tmux is another open-source terminal multiplexer. Just like screen, it can be installed easily if it is not already available on the system.

After logging into the EC2 instance, run tmux to start a new session, or tmux new -s <session-name> to give the session a name.

One good thing about tmux is that it uses Ctrl-b instead of Ctrl-a as the control character. In Bash and other shells with Emacs-style key bindings, Ctrl-a already does something else: it moves the cursor to the beginning of the line.

The session can be detached by pressing Ctrl-b followed by d. To reattach, run tmux attach-session -t <session-name>. Without -t, tmux attaches to the most recently used session.
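
The same workflow with tmux might look like the sketch below. As before, the session name long-job and the loop standing in for a real task are assumptions made for illustration, and tmux is assumed to be installed. The capture-pane command is one way to peek at the output of a running job without attaching to the session.

```shell
# Hypothetical session name, used only for illustration.
SESSION="long-job"

if command -v tmux >/dev/null 2>&1; then
  # -d creates the session without attaching to it; -s gives it a name.
  # The loop is a stand-in for a real long-running task.
  tmux new-session -d -s "$SESSION" 'for i in 1 2 3 4 5 6; do echo "step $i"; sleep 10; done'

  # List the sessions known to the tmux server.
  tmux list-sessions

  # Print the current contents of the session's pane without attaching.
  tmux capture-pane -p -t "$SESSION"
else
  echo "tmux is not installed"
fi

# To look at the job interactively, reattach:
#   tmux attach-session -t long-job
# and detach again with Ctrl-b followed by d.
```

Right after the session is created the captured pane may still be empty; running capture-pane again a little later shows the progress made so far.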

Concluding Remarks

Feel free to pick either tool. At the basic level, they are pretty much the same. tmux is often reported to be more user-friendly than screen. Both can share a session: screen has a built-in multiuser mode, and tmux lets several clients attach to the same session at once. Sharing across different user accounts takes some extra setup in either tool.

Both tools make a system administrator's job easier by letting processes run in the background, where their progress can be monitored at will. One can even close the laptop, come back later, reconnect to the VPN, SSH into the EC2 instance, and reattach to the session.

I encourage exploring the advanced features too, such as sharing sessions with other users, splitting the window into multiple panes, and arranging them to your taste to run different commands in parallel. Vote for your favorite tool in the comments.

Happy Multiplexing!