Getting set up to work

The initial lectures are in IPython notebook format. To get started, you need to clone the git repository for this course. Open up a terminal on your laptop.

Let’s create a directory (folder) in which to keep the code we’re going to work on. You’ll also keep more code in here during the core modules, and during your project work if you work on a computational project. The traditional name for a source code directory in the Linux world is src, so we type:

mkdir src

This means “make a directory called src”. You can list the contents of the current directory by typing:

ls

You should see a list of files and directories (which were put there automatically when your laptop was set up), and src should be among them.

Let’s move into the src directory:

cd src

(cd stands for change directory). We can check which directory we’re currently in with:

pwd

(print working directory). You should see something like /home/dham/src. Of course you’re going to see your username, rather than dham.
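
If you’d like to rehearse those four commands without cluttering your home directory, here is a minimal sketch that runs them in a throwaway directory. The use of mktemp and the variable name scratch are our own additions for illustration, not part of the course setup.

```shell
# Create a throwaway directory and move into it (scratch is a hypothetical name).
scratch=$(mktemp -d)
cd "$scratch"

# Make a directory called src, then list the contents to confirm it exists.
mkdir src
ls

# Move into src and print the working directory; the path should end in /src.
cd src
pwd
```

When you’re done experimenting, the throwaway directory can simply be deleted.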

What’s with all the stupid abbreviations?

The Unix shell, which is the thing you’re typing commands into, is a typing-based interface. Back in the ’70s when this was all invented, that probably meant a teleprinter (basically a keyboard and a printer). Typing was slow and error-prone, so the use of very short abbreviations was common both for the commands (which you don’t get to choose) and for file and directory names, which you do.

Using the standard names which are used in Linux (and Unix more generally) makes it easier to understand the work of other people, because everyone is speaking the same language.

We’ll walk through the next section together, so when you get here, stop and stick your green post-it to your laptop.

You can f**k off right now!

The Python exercises are stored in a git repository on GitHub. I’ll be updating them and (hopefully not too often) fixing bugs as we go. Of course, being good scientists, you’ll want to commit the changes you make as you go along as well. This gives you a great opportunity to practice your newly acquired git skills.

Those of you with a smutty mind may have thought that the title is a bit rude. What’s actually going on is that you’re each going to need your own copy of the GitHub repository to push your own changes to. This sort of copy is called a fork.

You might choose to hold the control key while clicking the next two links so that they open in a new tab and this page stays open.

  1. Head off to GitHub and make sure you’re logged in with the account you created.
  2. Navigate to https://github.com/mpecdt/Introduction-to-Python-programming-for-MPECDT and click on the “Fork” button at the top right.

Now you’ve got your own fork of the Python course, and you can see that because the title of the repository at the top of your screen will now be: username/Introduction-to-Python-programming-for-MPECDT, but with your GitHub username first.

Obtaining a local clone

Now we’re back in the territory of what we’ve already learned about GitHub. You need to go to your fork page. On the lower right is a box with the clone URL. Make sure you have the https URL - if not then click on the HTTPS link (if you already set up SSH keys on GitHub then you can use SSH instead). Click on the little clipboard logo to copy the URL.

Next, in the terminal type the following, with a space after it, but don’t press return:

git clone

Then paste the URL you just copied by pressing command + v (⌘ + v) (clicking with the middle mouse button also pastes). Press return and you should see the repository being cloned. Type:

ls

and you should see Introduction-to-Python-programming-for-MPECDT.
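
If you’re curious about what git clone actually set up, the following is a minimal sketch that clones a stand-in repository on the local disk rather than your real fork, so it needs no network or GitHub account. The names scratch and fork.git are hypothetical and chosen only for this illustration; with your real clone, the last command would show the URL you pasted instead.

```shell
# Work in a throwaway directory (scratch is a hypothetical name).
scratch=$(mktemp -d)
cd "$scratch"

# A stand-in for your fork on GitHub: an empty repository on the local disk.
git init --quiet --bare fork.git

# Clone it, just as you would clone the URL copied from GitHub.
# git names the new directory after the repository, dropping the .git suffix.
git clone --quiet fork.git
ls

# Inside the clone, list the remotes: origin points at the repository we cloned from.
cd fork
git remote -v
```

The remote called origin is where git push will send your commits later on.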

We’re about to use the tab key to avoid a lot of inconvenient typing. Pay close attention to this, because it’s a very useful trick which also works in the IPython shell.

We need to get into that directory to do some work. Type:

cd I

Now press the tab key. The shell should autocomplete the directory name for you, so you can press return and get into the directory.

Now you’re in there, we actually want to be in the notebook subdirectory, which is where the IPython notebooks will be.

cd notebook

Just to remind yourself of which folder you’re now in, type:

pwd

Finally, some Python

That was a bit of a saga, but it’s actually just what you do when setting up a new project. Day-to-day work in a revision control system is much, much simpler than that.

We are now in a position to actually run the first IPython notebook and do some Python work:

ipython3 notebook Python-1.ipynb

That line is another good use for tab completion!

The IPython notebook will appear in your web browser. We’re going to do a lot of our work over there now, but come back here at the end of the session.

Python 2 and Python 3

The Python world is transitioning through a major version change. Python version 2 is now rather old (its final release series, 2.7, dates from 2010) and Python 3 fixed up a number of irritating problems, especially problems which affected new users. Consequently, we’ll be learning Python 3. However, due to the need to remain compatible with old scripts, the names python and ipython still refer to Python 2 on almost all computers, so we need to explicitly type python3 and ipython3.
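
If you want to check which version each name refers to on your own machine, a quick sketch along these lines should do it. It assumes only that python3 is installed; what the bare name python reports, if it exists at all, will vary from one computer to another.

```shell
# Ask the python3 command which version it is; expect a line starting "Python 3".
python3 --version

# The bare name python may be Python 2, Python 3, or absent, depending on the machine.
if command -v python >/dev/null 2>&1; then
    python --version
else
    echo "there is no command called python here"
fi
```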

Finishing and committing

At the end of the session, you’ll have a modified version of the notebook file. Type:

git status

and you’ll probably see something like:

On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   Python-1.ipynb

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.ipynb_checkpoints/

no changes added to commit (use "git add" and/or "git commit -a")

So we’ll want to commit our changes, and push the result back to GitHub for safety. Write:

git add Python-1.ipynb
git commit -m "first day's work"
git push
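
To see the add and commit steps in isolation, here is a minimal sketch that runs them in a brand new scratch repository instead of your course clone. The scratch directory, the stand-in notebook file and the placeholder name and email are all assumptions made for illustration. There is no git push here because the scratch repository has no remote to push to.

```shell
# Start an empty repository in a throwaway directory (scratch is a hypothetical name).
scratch=$(mktemp -d)
cd "$scratch"
git init --quiet

# A stand-in for the notebook you edited during the session.
echo '{}' > Python-1.ipynb

# Stage the file, then record it. The -c options supply a placeholder identity
# in case git has not yet been configured on this machine.
git add Python-1.ipynb
git -c user.name="Example Student" -c user.email="student@example.com" \
    commit --quiet -m "first day's work"

# An empty short status means there is nothing left to commit.
git status --short

# Show the commit that was just recorded.
git log --oneline
```

Running git status again after a commit is a cheap way to confirm that nothing was left behind.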

It’s impossible to say this enough: commit early and commit often. Constant, small commits of everything you ever work on are essential to productive, reliable computational science. You should be committing several times per day when working on code or on a report or paper.

Pushing to the remote repository is essential to protect your data from disk failures and lost or stolen laptops. You should feel very nervous about leaving the office for the day without pushing; indeed, the safest policy is to push every time you commit.