Resources linked to the Lyell Centre and Keyworth Python clinics
- The Python Tutorial: Sections 1 - 5 are essential for getting started. Sections 10 and 11 are worth revisiting, even for experienced developers.
- Scipy Lecture Notes: Section 1 is an excellent start for using Python for data. Download as PDF and work away offline.
- GeoPandas Demo: Demonstrates Jupyter notebook, Pandas data frames and automation of GIS analysis.
- BGS Teams: Discuss Python
- Feather is a data file format that allows rapid access to large datasets within R. Pandas has functions, e.g. read_feather and to_feather, that read and write Feather files.
- datetime - Tools for dealing with date and time data
- logging - Log what's going on within your code
- pathlib - Object-oriented utilities to deal with files and directories
- traceback - Utilities for handling error message stack traces
- subprocess - amongst other things, allows access to command line utilities
- Numpy - Adds arrays for numerical work (turns Python into Matlab)
- Pandas - Adds dataframes for tabular data and time series (turns Python into R)
- Matplotlib - Make publication-quality plots
- GeoPandas - Adds geodataframes for GIS-style work
- pyproj - coordinate system transformations (wrapper to Proj4)
- fiona - read / write GIS vector data (wrapper to OGR)
- shapely - geometric operations for GIS vector data (wrapper to GEOS)
- cartopy - plot spatial data in different map projections (similar to GMT)
- iris - read / write / plot 4D gridded data (NetCDF, GRIB etc)
- arcpy - ESRI specific functions to handle and take advantage of ESRI constructs
- SQLAlchemy - Deal with databases as Python objects in backend-agnostic way
- sqlacodegen - Automatically generate models from existing database
- eralchemy - Automatically generate entity-relation diagrams from existing database
- scikit-learn - various algorithms for implementing a range of ML/AI techniques
- scikit-image - various algorithms for image processing
- falcon - lightweight web framework for creating HTTP APIs
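As a small illustration of the standard-library entries in the list above (datetime, logging and pathlib), the following sketch writes a text file named after today's date and logs what it did. The function name `write_dated_file` and the temporary directory are assumptions made for this example only.

```python
import datetime as dt
import logging
import tempfile
from pathlib import Path

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def write_dated_file(directory, text):
    """Write text to a file named after today's date and return its path."""
    # Path objects can be joined with the / operator
    path = Path(directory) / f"{dt.date.today():%Y-%m-%d}.txt"
    path.write_text(text)
    logger.info("Wrote %d characters to %s", len(text), path)
    return path


# A temporary directory keeps the example from leaving files behind
with tempfile.TemporaryDirectory() as tmp:
    out = write_dated_file(tmp, "hello")
    contents = out.read_text()
    name = out.name
```
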
Date | Attendance | Notes |
---|---|---|
2018-12-04 | 20+ | Overview of interest levels |
2019-01-15 | 12 | Anaconda |
2019-01-29 | 10 | Pandas (Jupyter notebook) |
2019-02-19 | 8 | Getting started with numpy (Jupyter notebook) |
2019-03-07 | 12 | Time series data compilation with Earth Observation data |
2019-03-21 | 5 | Intro to matplotlib (Jupyter notebook) |
2019-05-02 | 10 | BGS Oracle data access (example scripts) |
2019-05-16 | 7 | Python and the BGS HPC |
2019-07-09 | 9+2 | GeoPython 2019 conference summary |
2019-09-27 | 4 | Pandas for XML querying |
2019-10-08 | 6 | R for geospatial |
2019-10-29 | 7 | HPC guides and recipes |
2019-11-20 | 7 | Version control: Gitlab |
Date | Informatics | Scientists | Heriot Watt | Notes |
---|---|---|---|---|
2018-08-28 | 5 | 1 | 0 | Overview of interest levels |
2018-09-11 | 3 | 0 | 0 | Interactive plots with Plotly and Bokeh |
2018-09-25 | 7 | 5 | 0 | Good split into multiple groups |
2018-10-10 | 4 | 6 | 0 | Debugger and logging levels |
2018-10-23 | 2 | 7 | 0 | Easily convert coordinate systems with Pyproj |
2018-11-07 | 4 | 4 | 0 | Exception handling |
2018-11-20 | 5 | 4 | 0 | Virtual environments |
2018-12-05 | 5 | 4 | 0 | Testing with pytest and downloading PDFs |
2018-12-17 | 4 | 0 | 0 | Advent of Code dojo |
2019-01-15 | 5 | 0 | 2 | Datetime, file variable, installation |
2019-02-13 | ? | ? | ? | |
2019-02-26 | 4 | 0 | 1 | String methods, find replace and regular expressions |
2019-03-13 | 3 | 1 | 2 | Looping (notebook provided) |
2019-03-26 | 5 | 1 | 2 | Splitting time series files hawaii_co2 |
2019-04-10 | 2 | 1 | 1 | Chatted with Romesh about weathering in borehole records, decided it may be a ML problem. Advised HW person about GNU Octave as a post-student MATLAB alternative |
2019-04-30 | 5 | 0 | 1 | Reproducing official plot in hawaii_co2 |
2019-07-05 | 3 | 0 | 0 | Flask database table viewer webapp details |
2019-10-01 | 5 | 2 | 0 | Accessing dictionary keys as attributes details |
Themed tutorial hiatus | | | | |
2024-05-21 | 5 | 1 | 0 | Tutorials Return! Debugging in VSCode and pdb |
2024-06-04 | 4 | 3 | 0 | Objects and classes quick demo. Belfast office Zoom-ed in |
A walkthrough of using Anaconda is provided here
Cleaned up IPython code history from coordinate conversion problem:
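import pyproj


def to_alaska5(x, y):
    """
    Takes input in Alaska 4 (values in FEET!!!) and converts to Alaska 5.
    """
    # Length of a US survey foot in metres (rounded)
    FEET_TO_METRES = 0.3048006
    # Note: '+init=' strings and pyproj.transform are deprecated in
    # recent pyproj releases; pyproj.Transformer is the replacement.
    alaska4 = pyproj.Proj('+init=EPSG:26734')
    alaska5 = pyproj.Proj('+init=EPSG:26705')
    return pyproj.transform(alaska4, alaska5, x*FEET_TO_METRES, y*FEET_TO_METRES)


to_alaska5(209844.16, 2233473.9)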
Cleaned up code to demonstrate catching and handling of exceptions.
import logging
from traceback import TracebackException


# Custom exceptions can have helpful names and may perform extra
# tasks such as logging or raising an error dialog in a GUI application.
class HelpfulZeroException(Exception):
    """A custom exception that writes a log entry when created."""

    def __init__(self, *args):
        super().__init__(*args)
        # Runs when the exception is created, not when the class is defined
        logging.exception('Helpful Zero exception raised')


# This code demonstrates an error when you have a file open
try:
    f = open('test.txt', 'w')
    f.write('hello again\n')
    1/0  # BOOM!!!
    f.write('and again')  # this line will never be called
except ZeroDivisionError as err:
    # Catch the error and extract useful stack trace information
    tb_exception = TracebackException.from_exception(err)
    bad_line = tb_exception.stack[-1].lineno
    # Create a new error with more helpful information
    # The program will terminate here
    msg = f"Tried to divide by zero on line {bad_line}"
    raise HelpfulZeroException(msg) from err
finally:
    # This line is always called, whether an error was raised or not
    # In this case it makes sure that the file is always closed.
    f.close()
# Only reached if no exception was raised
logging.debug('Successfully wrote lines to file')
Note that this is a contrived example and that it is best to use context managers to make sure that files are closed when you are finished with them.
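A minimal sketch of the context-manager approach follows. The `with` block closes the file even when the division fails, so no `finally` clause is needed. The temporary directory and the variable names are assumptions made so the example cleans up after itself.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'test.txt'  # hypothetical file for this sketch
    caught = False
    try:
        # The file is closed automatically when the with block exits,
        # whether it exits normally or because of an exception.
        with open(path, 'w') as f:
            f.write('hello again\n')
            1 / 0  # BOOM!!!
            f.write('and again')  # this line will never be called
    except ZeroDivisionError:
        caught = True

    file_closed = f.closed      # True: the with block closed the file
    written = path.read_text()  # only the first line was written

print(caught, file_closed, repr(written))
```
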
The result of the first puzzle (written tests-first) is in the advent_of_code directory.
The following code was used as a demo of modules, classes and timedelta.
import datetime as dt


def my_function():
    print('hello from my function in {}'.format(__file__))


class MyClass(object):
    def __init__(self, name, date_of_birth):
        self.name = name
        # date_of_birth is a string such as '1990-12-31'
        self.date_of_birth = dt.datetime.strptime(date_of_birth, '%Y-%m-%d')

    def show_info(self):
        print('{} was born on {}'.format(self.name, self.date_of_birth))

    def show_age(self):
        # Subtracting two datetimes gives a timedelta
        age = dt.datetime.now() - self.date_of_birth
        age_years = age.days / 365  # approximate: ignores leap years
        print('{} is {:.1f} years old'.format(self.name, age_years))
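The timedelta arithmetic used in the age calculation can be tried on its own. This is a small sketch with fixed, made-up dates so that the result is reproducible; dividing by 365.25 is one common approximation for leap years and is an assumption of this example, not part of the demo code.

```python
import datetime as dt

# Fixed, made-up dates keep the result reproducible
date_of_birth = dt.datetime.strptime('2000-01-01', '%Y-%m-%d')
reference_date = dt.datetime(2020, 1, 1)

# Subtracting two datetimes gives a timedelta
age = reference_date - date_of_birth
age_years = age.days / 365.25  # 365.25 allows for leap years on average

# A timedelta can also be added to a datetime
one_week_later = reference_date + dt.timedelta(weeks=1)

print(type(age).__name__, age.days)
print('{:.1f} years'.format(age_years))
print(one_week_later.date())
```
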
All string objects have useful methods built-in.
message = "Hello world"
type(message)
dir(message)
message.upper()
message.lower()
message_caps = message.upper()
message.split()
message.startswith('H')
message.lower().startswith('h')
Simple find and replace:
message.find('World')  # returns -1: find() is case-sensitive, so 'World' is not found
message.replace('world', 'Charlie')  # returns 'Hello Charlie'
Regular expressions can match text patterns, but can be tricky to use. The Pythex website helps to test them. Regular expressions can also do search and replace.
Links:
Contact data example:
contacts = """A Geologist, 0131 650 0260, [email protected], EH14 4AP
S Developer, 0131 650 5432, [email protected], M1 1AA
T ypoMess, 01316506666, [email protected], sw1x 4qq"""
print(contacts)
Try to match phone numbers, email addresses and postcodes.
Pythex examples:
Python returns matched data as groups:
import re
match = re.search(r'(\d{4} ?\d{3} ?\d{4})', contacts)
match.groups()
Regular expressions can be used for find and replace, e.g. replace all phone numbers with the switchboard.
re.sub(r'\d{4} ?\d{3} ?\d{4}', '0131 650 1000', contacts)
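One possible set of patterns for the phone numbers, email addresses and postcodes is sketched below. The email addresses in this copy of the contact data are placeholders invented for the sketch, and the patterns are deliberately loose: they are assumptions that suit this small sample, not full validators.

```python
import re

# Contact data with made-up example.org addresses (assumption for this sketch)
contacts = """A Geologist, 0131 650 0260, a.geologist@example.org, EH14 4AP
S Developer, 0131 650 5432, s.developer@example.org, M1 1AA
T ypoMess, 01316506666, t.ypomess@example.org, sw1x 4qq"""

# 4 + 3 + 4 digits with optional spaces between the groups
PHONE = re.compile(r'\d{4} ?\d{3} ?\d{4}')
# Something before the @, then a dotted domain name
EMAIL = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')
# Loose UK postcode shape, matched in either case
POSTCODE = re.compile(r'[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}', re.IGNORECASE)

phones = PHONE.findall(contacts)
emails = EMAIL.findall(contacts)
# Tidy up the lower-case postcode while collecting the matches
postcodes = [code.upper() for code in POSTCODE.findall(contacts)]

print(phones)
print(emails)
print(postcodes)
```

Compiling the patterns once with `re.compile` gives each one a name, which makes longer scripts easier to read than repeating the raw pattern strings.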
Attempt to reproduce official Hawaii CO2 plot with Pandas and Matplotlib. See partial solution at ./edinburgh_materials/hawaii-plot.py.