Python Clinic

Resources linked to the Lyell Centre and Keyworth Python clinics

Handy links

Python distributions

Tutorials

  • The Python Tutorial: Sections 1 - 5 are essential for getting started. Sections 10 and 11 are always worth revisiting, even for experienced developers.
  • Scipy Lecture Notes: Section 1 is an excellent start for using Python for data. Download as PDF and work away offline.

Demos

  • GeoPandas Demo: Demonstrates Jupyter notebook, Pandas data frames and automation of GIS analysis.

Misc

Useful libraries

Standard library

  • datetime - Tools for dealing with date and time data
  • logging - Log what's going on within your code
  • pathlib - Object-oriented utilities to deal with files and directories
  • traceback - Utilities for handling error message stack traces
  • subprocess - amongst other things, allows access to command line utilities

Science

  • Numpy - Adds arrays for numerical work (turns Python into Matlab)
  • Pandas - Adds dataframes for tabular data and time series (turns Python into R)
  • Matplotlib - Make publication-quality plots

Spatial

  • GeoPandas - Adds geodataframes for GIS-style work
  • pyproj - coordinate system transformations (wrapper to Proj4)
  • fiona - read / write GIS vector data (wrapper to OGR)
  • shapely - geometric operations for GIS vector data (wrapper to GEOS)
  • cartopy - plot spatial data in different map projections (similar to GMT)
  • iris - read / write / plot 4D gridded data (NetCDF, GRIB etc)
  • arcpy - ESRI specific functions to handle and take advantage of ESRI constructs

Databases

  • SQLAlchemy - Deal with databases as Python objects in a backend-agnostic way
  • sqlacodegen - Automatically generate models from an existing database
  • eralchemy - Automatically generate entity-relation diagrams from an existing database

Machine learning / Artificial intelligence

  • scikit-learn - various algorithms for implementing a range of ML/AI techniques

Image processing

Web services

  • falcon - lightweight web framework for creating HTTP APIs

Past meetings

Keyworth

| Date | Attendance | Notes |
|------|------------|-------|
| 2018-12-04 | 20+ | Overview of interest levels |
| 2019-01-15 | 12 | Anaconda |
| 2019-01-29 | 10 | Pandas (Jupyter notebook) |
| 2019-02-19 | 8 | Getting started with numpy (Jupyter notebook) |
| 2019-03-07 | 12 | Time series data compilation with Earth Observation data |
| 2019-03-21 | 5 | Intro to matplotlib (Jupyter notebook) |
| 2019-05-02 | 10 | BGS Oracle data access (example scripts) |
| 2019-05-16 | 7 | Python and the BGS HPC |
| 2019-07-09 | 9+2 | GeoPython 2019 conference summary |
| 2019-09-27 | 4 | Pandas for XML querying |
| 2019-10-08 | 6 | R for geospatial |
| 2019-10-29 | 7 | HPC guides and recipes |
| 2019-11-20 | 7 | Version control: Gitlab |

Lyell Centre

| Date | Informatics | Scientists | Heriot Watt | Notes |
|------|-------------|------------|-------------|-------|
| 2018-08-28 | 5 | 1 | 0 | Overview of interest levels |
| 2018-09-11 | 3 | 0 | 0 | Interactive plots with Plotly and Bokeh |
| 2018-09-25 | 7 | 5 | 0 | Good split into multiple groups |
| 2018-10-10 | 4 | 6 | 0 | Debugger and logging levels |
| 2018-10-23 | 2 | 7 | 0 | Easily convert coordinate systems with Pyproj |
| 2018-11-07 | 4 | 4 | 0 | Exception handling |
| 2018-11-20 | 5 | 4 | 0 | Virtual environments |
| 2018-12-05 | 5 | 4 | 0 | Testing with pytest and downloading PDFs |
| 2018-12-17 | 4 | 0 | 0 | Advent of Code dojo |
| 2019-01-15 | 5 | 0 | 2 | Datetime, file variable, installation |
| 2019-02-13 | ? | ? | ? | |
| 2019-02-26 | 4 | 0 | 1 | String methods, find replace and regular expressions |
| 2019-03-13 | 3 | 1 | 2 | Looping (notebook provided) |
| 2019-03-26 | 5 | 1 | 2 | Splitting time series files hawaii_co2 |
| 2019-04-10 | 2 | 1 | 1 | Chatted with Romesh about weathering in borehole records, decided it may be a ML problem. Advised HW person about GNU Octave as a post-student MATLAB alternative |
| 2019-04-30 | 5 | 0 | 1 | Reproducing official plot in hawaii_co2 |
| 2019-07-05 | 3 | 0 | 0 | Flask database table viewer webapp (details) |
| 2019-10-01 | 5 | 2 | 0 | Accessing dictionary keys as attributes (details) |
| | | | | Themed tutorial hiatus |
| 2024-05-21 | 5 | 1 | 0 | Tutorials Return! Debugging in VSCode and pdb |
| 2024-06-04 | 4 | 3 | 0 | Objects and classes quick demo. Belfast office Zoom-ed in |

Notes

Working with Jupyter notebooks

A walk through using Anaconda is provided here

2018-10-23 Pyproj code

Cleaned up IPython code history from coordinate conversion problem:

import pyproj

def to_alaska5(x, y):
    """
    Takes input in Alaska 4 (values in FEET!!!) and converts to Alaska 5.
    """
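    # One US survey foot in metres.
    FEET_TO_METRES = 0.3048006
    # NOTE: recent pyproj releases deprecate the '+init=' syntax and
    # pyproj.transform in favour of pyproj.Transformer.
    alaska4 = pyproj.Proj('+init=EPSG:26734')
    alaska5 = pyproj.Proj('+init=EPSG:26705')
    return pyproj.transform(alaska4, alaska5, x*FEET_TO_METRES, y*FEET_TO_METRES)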

to_alaska5(209844.16, 2233473.9)

2018-11-07 Exception handling

Cleaned up code to demonstrate catching and handling of exceptions.

import logging
from traceback import TracebackException

# Custom exceptions can have helpful names and may perform extra
# tasks such as logging or raising an error dialog in a GUI application.
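class HelpfulZeroException(Exception):
    """A custom exception that writes a log entry when it is created."""
    def __init__(self, *args):
        super().__init__(*args)
        # Log here, not in the class body: code in the class body runs
        # only once, when the class is defined.
        logging.error('Helpful Zero exception raised: %s', self)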

# This code demonstrates an error when you have a file open
try:
    f = open('test.txt', 'w')
    f.write('hello again\n')
    1/0  # BOOM!!!
    f.write('and again')  # this line will never be called
    
except ZeroDivisionError as err:
    # Catch the error and extract useful stack trace information
    tb_exception = TracebackException.from_exception(err)
    bad_line = tb_exception.stack[-1].lineno
    
    # Create a new error with more helpful information
    # The program will terminate here
    msg = f"Tried to divide by zero on line {bad_line}"
    raise HelpfulZeroException(msg) from err

finally:
    # This line is always called, whether an error was raised or not
    # In this case it makes sure that the file is always closed.
    f.close()

# Not reached in this demo: the re-raised exception ends the program first.
logging.debug('Successfully wrote lines to file')

Note that this is a contrived example and that it is best to use context managers to make sure that files are closed when you are finished with them.
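A minimal sketch of the context manager approach, for comparison with the try / finally version above. The function name `write_then_fail` and the use of a temporary directory are illustrative assumptions, not part of the code from the session.

```python
import tempfile
from pathlib import Path

def write_then_fail(path):
    """Write one line, hit an error, and return the file object."""
    try:
        # The with-block closes the file on exit, even when an error is raised.
        with open(path, 'w') as f:
            f.write('hello again\n')
            1 / 0  # BOOM!!!
            f.write('and again')  # this line will never be called
    except ZeroDivisionError:
        pass  # handle or re-raise as appropriate
    return f

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'test.txt'
    f = write_then_fail(target)
    file_was_closed = f.closed
    contents = target.read_text()
```

No `finally` block is needed here: the file is already closed by the time the `except` clause runs.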

2018-12-18 Advent of code

The results of the first puzzle (written tests-first) are in the advent_of_code directory.

2019-01-15 Datetime overview

The following code was used as a demo of modules, classes and timedelta.

import datetime as dt

def my_function():
    print('hello from my function in {}'.format(__file__))
    
class MyClass(object):
    def __init__(self, name, date_of_birth):
        self.name = name
        self.date_of_birth = dt.datetime.strptime(date_of_birth, '%Y-%m-%d')
    
    def show_info(self):
        print('{} was born on {}'.format(self.name, self.date_of_birth))
    
    def show_age(self):
        age = dt.datetime.now() - self.date_of_birth
        age_years = age.days / 365
        print('{} is {:.1f} years old'.format(self.name, age_years))
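A short sketch of the timedelta arithmetic that `show_age` relies on. The dates are arbitrary illustrative values, not taken from the session.

```python
import datetime as dt

# Illustrative values only.
start = dt.datetime(2019, 1, 15, 9, 0)
end = dt.datetime(2019, 2, 26, 17, 30)

# Subtracting two datetimes gives a timedelta.
gap = end - start
whole_days = gap.days                # complete days in the gap
hours = gap.total_seconds() / 3600   # the whole gap expressed in hours

# Adding a timedelta to a datetime gives another datetime.
two_weeks_later = start + dt.timedelta(weeks=2)
```

Note that `gap.days` counts whole days only; the leftover part of a day is held separately in `gap.seconds`.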

2019-02-26 String methods, find replace and regular expressions

All string objects have useful methods built-in.

message = "Hello world"
type(message)
dir(message)
message.upper()
message.lower()
message_caps = message.upper()
message.split()
message.startswith('H')
message.lower().startswith('h')

Simple find and replace:

message.find('World')
message.replace('world', 'Charlie')
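Two details of the calls above are easy to miss: `find` is case-sensitive, and string methods return new strings without changing the original. A small sketch (the variable names are illustrative):

```python
message = "Hello world"

# find() is case-sensitive and returns -1 when the substring is absent.
missing = message.find('World')

# Lower-casing the message first gives a case-insensitive search.
found_at = message.lower().find('world')

# replace() returns a new string; the original message is unchanged.
replaced = message.replace('world', 'Charlie')
```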

Regular expressions can match text patterns, but can be tricky to use. The Pythex website helps to test them. Regular expressions can also do search and replace.

Links:

Contact data example:

contacts = """A Geologist, 0131 650 0260, [email protected], EH14 4AP
S Developer, 0131 650 5432, [email protected], M1 1AA
T ypoMess, 01316506666, [email protected], sw1x 4qq"""
print(contacts)

Try to match phone numbers, email addresses and postcodes.

Pythex examples:

Python returns matched data as groups:

import re
match = re.search(r'(\d{4} ?\d{3} ?\d{4})', contacts)
match.groups()

Regular expressions can be used for find and replace, e.g. replace all phone numbers with the switchboard.

re.sub(r'\d{4} ?\d{3} ?\d{4}', '0131 650 1000', contacts)
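One possible answer to the matching exercise, as a sketch only. The contact data uses placeholder example.com addresses (an assumption, not the original data), and the email and postcode patterns are deliberately loose: fine for a demo, not for validation.

```python
import re

# Hypothetical data in the same layout as the contacts example;
# the example.com addresses are placeholders.
contacts = """A Geologist, 0131 650 0260, a.geologist@example.com, EH14 4AP
S Developer, 0131 650 5432, s.developer@example.com, M1 1AA
T ypoMess, 01316506666, t.ypomess@example.com, sw1x 4qq"""

# 4 digits, 3 digits, 4 digits, with optional single spaces between.
phone_pattern = re.compile(r'\d{4} ?\d{3} ?\d{4}')
# Something@something.something, without trying to be exhaustive.
email_pattern = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')
# Rough UK postcode shape, ignoring case so that 'sw1x 4qq' also matches.
postcode_pattern = re.compile(r'[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}', re.IGNORECASE)

# findall() returns every non-overlapping match as a list of strings.
phones = phone_pattern.findall(contacts)
emails = email_pattern.findall(contacts)
postcodes = postcode_pattern.findall(contacts)
```

`re.findall` is used here in place of `re.search` because `search` stops at the first match, while the exercise asks for every entry.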

2019-04-03 Hawaii plotting

Attempt to reproduce official Hawaii CO2 plot with Pandas and Matplotlib. See partial solution at ./edinburgh_materials/hawaii-plot.py.