Skip to main content Skip to navigation

Python scripting

Python is a versatile language for writing scripts, i.e. a series of commands that gets interpreted and executed on the fly. You can write Python scripts to make plots, read or write data files, reduce data, re-organise files, access resources on the network, send e-mail, etc. It is not astronomy-specific and has a large and growing user-base across many scientific disciplines.

NB I created this page in the old days before we started teaching Python in the first & second years so you probably don't need to read it in detail although it may have the odd useful trick you haven't encountered. Also bear in mind that it was written when Python2.XX was king; Python3 has a few differences and so there may be the odd bug. You should develop under Python3.

Here are links to some common tasks and features as listed on this page:

  1. Basic Python script
  2. Numerical Python
  3. Reading FITS files in Python
  4. Plotting with pgplot in Python
  5. Reading ASCII files
  6. Periodograms
  7. Finding modules
  8. Finding out more about Python

A basic Python script

Here is a script to show some features of Python:

#!/usr/bin/env python3

x = 2.
print(x, 2*x, 1/x)

# load mathematical functions
from math import *

# use a couple of them (natural log and exponential function)
print(log(x), exp(x))

# create a Python list
al = [1.,2.,3.,4.,5.]

# loop through it
for num in al:
   print('number =',num)

# define a function
def doubleit(x):
	return 2*x

# use it
y = doubleit(x)

print('Twice',x,'=',y)

This is mostly self-explanatory, except the first line which is a standard unix construct for the first line of scripts to say what interpreter in being used. If you create this script as a file such as 'script.py', then you can make it executable using 'chmod +x script.py' and then run it as './script.py'. Even easier the first time is to invoke the interactive python interpreter 'ipython' and type the above lines in. The only other notable line is the 'from math import *' which is one way to load an extra module. See also below. Note how in Python ':' and indentation are used where C and Perl uses braces {}.

Numerical Python

Python lists are highly flexible and can contain mixed types of data, e.g. [1,2.0,'a string'] which starts with an integer, then a float, then a string. However, they are inefficient when handling images where all one wants is a single data type which is handled fast. Therefore you are very likely to want to use the numerical module 'numpy' which has a relatively compact array class, and can handle vectorised computations. e.g.

#!/usr/bin/env python3

import numpy as npy

# create a numpy array by conversion of a Python 'tuple' (between parentheses ())
arr = npy.array((1.,2.,3.,4.,5.,6.,7.,8.))

# Multiply all elements by 3
arr *= 3

# copy to another new array
narr = arr.copy()

# add first array to 1/new array
comb = arr + 1/narr

# create an array of complex numbers
real = npy.array((1.,1.,2.,3.))
imag = npy.array((-1.,2.,1.,2.))
i = complex(0.,1.)

cplx = real + i*imag

# take their exponential
ecplx = npy.exp(cplx)

Note that the above operations are carried out on all elements and especially for large arrays, can be much faster than doing so with loops inside Python. 'numpy' can handle data of arbitrary dimensions and has such features as computation of medians, averages, etc and much more besides. 'numpy' is installed here already. A particularly nice feature of numpy is shown in the next example (header code skipped for brevity, assume data is some numpy array representing an image, and that x and y are 1D arrays representing a spsectrum):

# limit data to 1000 maximum
data[data > 1000] = 1000.

# calculate mean spectrum value over the range 600 to 700 nm
print(y[(x > 600) & (x < 700)].mean())

Statements like 'data > 1000' generate boolean True/False arrays which can be used as array indices. This is very powerful and often useful.

FITS

FITS is a commonly used data format in astronomy. The module "astropy" can read FITS (and does much else besides). Here is an example of basic usage:

#!/usr/bin/env python3

from astropy.io import fits

hdulist   = fits.open('image.fits')
data      = hdulist[0].data
fhead     = hdulist[0].header      
hdulist.close()

# subtract 1 from the data (it is a numpy array)
data -= 1

# print the data range
print('Data ranges from',data.min(),'to',data.max())

Note that this script makes some assumptions about the format of the FITS file, but it can be easily adapted.

Plotting from within Python

More than one module is available for making plots. A popular choice is 'matplotlib'.

#!/usr/bin/env python3
#
# Script to plot a sinusoid

import numpy as np
import matplotlib.pyplot as plt # create a sinusoid running from -10 to +10, 200 points. x = np.linspace(-10.,10.,200) y = np.sin(x) # plot with matplotlib
plt.plot(x,y)
plt.show()
# Following line replacing plt.show() will create an equivalent PDF plot
#plt.savefig('plot.pdf')

There is extensive documentation of matplotlib on-line.

Reading ASCII files

Often, input data is available in multi-column ASCII files. Numpy has a useful command called 'loadtxt' that reads such files and transfers column data into numpy arrays. The following line will generate two numpy arrays x,y containing the first two columns of the file, ignoring any lines that start with # and use space as the column delimiter:

x,y = numpy.loadtxt(file, unpack=True, comments='#',delimiter=' ')

Another example to select specific columns:

x,y,z = numpy.loadtxt(file, unpack=True, usecols=(0,2,3))

There are other modules available that can handle ASCII files such as CSV, and you can easily write your own custom reader using generic python file I/O that read files on a line by line basis (as a string):

for line in open(filename, 'r'):
	fields = line.strip().split(' ')
	...

numpy.savetxt can save data in ASCII column format

Finding modules

In the examples above, 'math', 'pyfits' and 'numpy' are all modules in addition to those provided in 'core' Python. There are a huge number of such modules and for some new task it is worth finding out whether a suitable module exists. Examples are 'smtplib' for sending e-mail via SMTP servers and 'imaplib' to communicate with IMAP servers. The best way to find them is by googling, but the other problem is when you invoke a script, how does python know where to look for the installed modules? The answer is, it looks through a series of directories specified via an enviroment variable called PYTHONPATH. In bash for instance, the following line

export PYTHONPATH="/home/astro/phsaap/.local/lib/python3.6/site-pacakages:${PYTHONPATH}"

directs PYTHON to look through my (TRM) python directories, along with whatever are defined for the system.

Finding out more about Python

There are many on-line resources and these are recommended in preference to books as they will always be more up-to-date. Here are some links to these

  1. Main Python site. Loads of links from here.
  2. Python tutorial. A good place to start learning the language.
  3. Scientific Python including numpy tutorial and documentation and lots of useful modules.
  4. AstroPython hosts links to resources and tutorials relevant for astronomy
  5. AstroPy is work in progress and aims to consolidate popular python packages
  6. Python 4 Kids yes, that's you ...

The facility 'pydoc' is useful too as in 'pydoc numpy' or 'pydoc numpy.array'.

You should minimally aim to master: lists, tuples, and dictionaries (three basic data structures), strings, how to open, read and write files, the standard statements such as 'for', 'while, 'if' and how to define functions (very easy). As you get more advanced you may want to learn how to define your own object types or 'classes', and if you are carrying out fancy text operations (e.g. re-formatting tables, extracting information from program output), you may want to look at the 're' regular expression module. Finally the 'os' and 'sys' modules provide basic interaction with the system.