First Steps with Python 3

Here I am taking my first steps with Python 3. Although I learnt a little a year or so ago, it’s nearly all fallen out of my head, and I need to take a more systematic approach before I can really use it for anything.

My idea so far is that I will practise Python 3 and the pandas library up until the point I am able to do everything that I teach my first-year students to do with R. This might involve attempting tasks that are easy in R but monstrously difficult in Python, but that’s all part of the learning process.

The first step was figuring out how to get Python code chunks on this website. This website is built using R and is specifically designed to make it easy to show off R code. I decided to use Jupyter notebooks and then convert them into Markdown. If there’s an easier way, I sure can’t think of it. This first post will be partly just a test of whether this works at all.

print("Hello World")
Hello World

If it’s working, we should be able to see some output above. The next thing we usually tackle in the R labs is saving and calling vectors. I’m not sure if there is such a thing as a vector in Python. I’ll try to find something similar.

x = [1,2,3,4]
print(x)
[1, 2, 3, 4]

This looks a bit like a vector. But I’ve already stumbled on a problem. I can’t apply operations to it the way I would in R.

x*2
[1, 2, 3, 4, 1, 2, 3, 4]

In R this would have given me 2, 4, 6, 8. Here the list has been repeated twice instead. I know another way around it, but I’m not sure if this is the easiest way.

for i in x:
    print(i*2)
2
4
6
8
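The loop above only prints the doubled values; it doesn’t save them anywhere. From what I can tell, a list comprehension is the usual plain-Python way to build a new list from an old one. A minimal sketch, where `doubled` is just a name I’ve made up:

```python
x = [1, 2, 3, 4]

# Build a new list holding each element of x multiplied by 2
doubled = [i * 2 for i in x]
print(doubled)
```

The original list `x` is left unchanged, which is closer to how `x * 2` behaves in R.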

In fact, I know this isn’t the best way. From reading around online, it looks like I can get something that acts more like vectors in R using either NumPy or pandas. I know it’s taboo to load a library part way through a notebook, but I’ll try it anyway.

import numpy as np
x = np.array([1,2,3,4])
x*2
array([2, 4, 6, 8])

This seems to behave much more like a numeric vector in R, including letting me apply functions to the whole array.

np.mean(x)
2.5
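A few of the other summary functions from the R labs seem to have direct NumPy counterparts. One thing to watch is that `np.std` divides by n unless told otherwise, whereas R’s `sd()` divides by n - 1, so `ddof=1` is needed to match R. A small sketch, with variable names of my own choosing:

```python
import numpy as np

x = np.array([1, 2, 3, 4])

total = x.sum()                 # like sum(x) in R
largest = x.max()               # like max(x) in R
pop_sd = np.std(x)              # NumPy's default: divides by n
sample_sd = np.std(x, ddof=1)   # divides by n - 1, as R's sd() does

print(total, largest, pop_sd, sample_sd)
```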

This is pretty much as far as the first lab usually gets, although I could try saving myself some time by generating the numbers rather than typing them out.

x = range(1,10)
x = np.array([x])
x*2
array([[ 2,  4,  6,  8, 10, 12, 14, 16, 18]])

I think I’ve run head first into a cousin of zero-indexing: range() stops just before its end value, so range(1,10) only goes up to 9. Ideally I wanted to end on 10. Also, I don’t know what all these brackets mean. But it’s not far off what I wanted.

x = range(1,11)
x = np.array([x])
x*2
array([[ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20]])
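The double brackets seem to come from wrapping the range in a list before handing it to `np.array`, which makes a two-dimensional array with a single row. Passing the range directly, or using `np.arange`, should give a flat array instead. A small sketch of the difference, where `nested`, `flat` and `also_flat` are names I’ve made up for illustration:

```python
import numpy as np

# Wrapping the range in a list gives a 2-D array with one row
nested = np.array([range(1, 11)])
print(nested.shape)

# Passing the range directly, or using np.arange, gives a flat 1-D array
flat = np.array(range(1, 11))
also_flat = np.arange(1, 11)
print(flat.shape)
print(flat * 2)
```

Checking `.shape` looks like a quick way to tell the two cases apart.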

Saving character vectors usually comes next.

farm = ["Cow","Horse","Sheep","Chicken","Sheep"]
print(farm)
['Cow', 'Horse', 'Sheep', 'Chicken', 'Sheep']

Then I want to select a single element.

farm[0]
'Cow'
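Note that the first element is at position 0, not 1 as it would be in R. Negative indices also behave differently: in R, `farm[-1]` drops the first element, whereas in Python it picks the last one. A short sketch of indexing and slicing, with variable names I’ve chosen for illustration:

```python
farm = ["Cow", "Horse", "Sheep", "Chicken", "Sheep"]

first = farm[0]      # first element (R would use farm[1])
last = farm[-1]      # negative indices count back from the end
middle = farm[1:3]   # slice: positions 1 and 2, stopping before 3

print(first, last, middle)
```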

Looks good. How about counting these? What’s the equivalent of the table() function? Might have to try another library.

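import pandas as pd
# pd.value_counts() is deprecated in recent pandas versions; counting via a Series is the supported route
pd.Series(farm).value_counts()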
Sheep      2
Cow        1
Horse      1
Chicken    1
dtype: int64
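It turns out another library may not be strictly necessary for this. The standard library’s `collections.Counter` appears to do a similar job to `table()`. A minimal sketch, where `counts` is my own variable name:

```python
from collections import Counter

farm = ["Cow", "Horse", "Sheep", "Chicken", "Sheep"]

counts = Counter(farm)        # maps each distinct value to its frequency
print(counts["Sheep"])
print(counts.most_common())   # (value, count) pairs, most frequent first
```

The pandas version is probably still the better fit once the data lives in a data frame, but this works on a plain list.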

This is probably enough for now. I think I need to try a more systematic approach. There’s only so far you can get by guesswork.

Dr Greg Stride
Researcher

My research interests include UK elections, election administration and public opinion
