📜 Development Environment Configuration for Scientists

January 5, 2017  

If you are hoping to improve your research by improving your computational literacy, it’s absolutely imperative that your computer is set up correctly for the job. In this guide, I am going to run through how to set up your computing environment for programming and data science. In much the same way that you install reference managers and cloud solutions to make your computer the best possible writing device, here we will be building your programming toolkit from the essentials.

Before we jump into the nitty gritty, it’s important to be aware of the limitations of this guide (and any one like it).

Scope

The focus of this guide will be on Python and R development (though I do cover more than these).

One of the main metrics I built this environment against was slickness. Using the terminal and programming are much easier when your computing environment feels slick. Moreover, I believe that poor display environments actually inhibit productivity. I’m not entirely sure why this is the case - maybe a line for future research - but it seems intuitive to me.

Another important caveat is that much of this guide is written from the perspective of a Linux user. While this guide is certainly going to be useful to Windows users, your life is going to be made a lot easier if you are using a Mac or Linux based environment.

This guide will cover the basics of setting up your home directory, choosing an editor, and how to make your environment feel lush!

Choosing an Operating System

I know, I know, if you are reading this guide, you’re probably already set in your ways on which kind of operating system to use. I’m not going to beat this debate to death - but I think it’s important to run through some basics.

In the world of operating systems, the common scientific user is likely to choose between one of three families of operating system: macOS, Windows, and Linux. I use all of these on a regular basis - but Linux will always be my favourite. The primary advantage of using Linux or macOS is that you get great support for command line tools, in many cases out of the box. Windows still has a lot of growing to do before it can compete with these Unix-like operating systems. The primary advantage of Linux distributions, such as Ubuntu, is that you get a package manager - a tool which automatically installs and configures software for you.

If you want to get a really comfortable data science environment, it is important that you are first comfortable with your operating system. While beyond the scope of this post, I think it is well worth spending the time to work out whether a different operating system might work better for you, or whether you need to spend some more time learning the one you are currently using. On a final point, I firmly believe that in order to be an effective developer and data scientist, you really need to learn how to use a terminal environment, and there is no better operating system for this than Linux.

Choosing and Installing Editors

You are going to need to install two code editors. Writing code without an editor is like trying to write your paper in WordPad. Some people suggest using IDEs (integrated development environments) - but I do not. If you are just starting out with programming, you will not use the features of the IDE, and more than likely, you will be confused by the interfaces, options, menus and settings. Instead, you want a good text editor. I’m not going to suggest vim or Emacs, but I do recommend you choose one of them at some stage.

For now you want to install the following two editors:

Fonts and AESTHETIC

  * Patched Powerline Fonts
  * Go Mono for Powerline

Terminal Environment

Getting your terminal environment to be comfy is the best first step you can take to improve your computing efficiency. Let’s face it, if you are reading this article, you’re probably not a command-line guru already and you are probably looking for an environment that’s more than a bare bash command line. In what follows, I will get you started with the following:

In the image below, you can see my terminal environment in all its glory.

The Emulator

The first thing you need is a good terminal emulator. A terminal emulator is the program you use to actually run a terminal session. On a Mac, that usually means iTerm2, rather than the default “Terminal” application. On Linux, it generally means Terminator rather than Gnome Terminal.

Personally, as a Linux user, I use two terminal emulators: Guake and Terminator. Guake is a Quake style dropdown terminal. This means you can bind a key combination (in my case Ctrl+Alt+~) to display a terminal from anywhere in your working environment. In my workflow, I tend to use the instant Guake dropdown terminal for quick and easy commands - while I use Terminator as a more “always there” terminal emulator.

For Windows users, I’d suggest downloading Cmder. It is built on top of ConEmu and comes with a Cygwin-like setup out of the box (i.e. useful Unix-like commands).

That’s pretty much all there is to it: download one of these and you’ll be fine.

Choosing A Shell

When you type commands into the terminal, the program that interprets them is called the shell, and the typical default is bash. However, this is not your only option. Zsh is the most popular alternative. It has some substantial benefits over bash, though they are beyond the scope of this article.
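
If you are not sure which shell you are running right now, it is worth checking before you switch. Here is a minimal Python sketch of how you might do that. The function name `shell_report` is my own invention, and it assumes a Unix-like system where the `SHELL` environment variable holds the path to your login shell (on Windows it will usually be unset).

```python
import os
import shutil


def shell_report(environ=os.environ, which=shutil.which):
    """Summarise the login shell and whether bash and zsh are on the PATH.

    `environ` and `which` are parameters only so the function can be
    exercised with fake values; by default they use the real environment.
    """
    # $SHELL holds the path of the login shell on Unix-like systems.
    # It may be missing entirely, in which case we report None.
    login_shell = environ.get("SHELL")
    # shutil.which returns the full path of an executable, or None if
    # nothing with that name is found on the PATH.
    available = {name: which(name) for name in ("bash", "zsh")}
    return {"login_shell": login_shell, "available": available}


if __name__ == "__main__":
    report = shell_report()
    print("Login shell:", report["login_shell"] or "not set")
    for name, path in report["available"].items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If zsh shows up as not found, install it with your package manager before going any further.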

Oh-my-zsh is a framework devoted to creating an excellent out of the box terminal environment. After a fresh install, you will get a nice looking terminal, a bunch of autocomplete configurations, and a range of plugins to boost your productivity. There’s not much more to say - if you don’t have a strong preference - just install Oh-my-zsh and you’ll be ready to go.

Beauty is in the eye of the beholder. That said, I have a minimalist perspective. If you have installed Oh-my-zsh, you can set your theme to one of the many that come pre-installed.
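
To give a rough idea of what setting a theme involves: Oh-my-zsh reads the `ZSH_THEME` variable from your `~/.zshrc` when it is loaded. The Python sketch below rewrites that line in a copy of the file’s text rather than touching the real file. The helper name `set_zsh_theme` is hypothetical, and the sample contents are only my assumption of what a fresh install looks like. `robbyrussell` and `agnoster` are two of the bundled themes, and `agnoster` relies on Powerline-patched fonts like the ones listed earlier.

```python
import re


def set_zsh_theme(zshrc_text, theme):
    """Return a copy of zshrc_text with ZSH_THEME set to `theme`."""
    new_line = f'ZSH_THEME="{theme}"'
    # Match an uncommented ZSH_THEME assignment that starts a line.
    pattern = re.compile(r"^ZSH_THEME=.*$", flags=re.MULTILINE)
    if pattern.search(zshrc_text):
        # Replace only the first assignment; a function is used as the
        # replacement so the theme name is inserted literally.
        return pattern.sub(lambda _match: new_line, zshrc_text, count=1)
    # No assignment found. Oh-my-zsh reads ZSH_THEME when its script is
    # sourced, so the new line goes at the top, ahead of that step.
    return new_line + "\n" + zshrc_text


if __name__ == "__main__":
    # Assumed sample of a freshly generated ~/.zshrc, for illustration only.
    sample = (
        'export ZSH="$HOME/.oh-my-zsh"\n'
        'ZSH_THEME="robbyrussell"\n'
        "plugins=(git)\n"
        "source $ZSH/oh-my-zsh.sh\n"
    )
    print(set_zsh_theme(sample, "agnoster"))
```

In practice most people just open `~/.zshrc` in their editor, change the one line by hand, and start a new terminal session to see the result.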

Configuration

There are thousands of configuration guides out there for

Useful Links

There are thousands of resources out there that go well beyond the scope of my contributions here. Below I’ve tried to just place some of the essentials I’ve found along the way as well as resources I’ve come across in writing this post: