Python Lesson #2

One of the first things you are probably going to want to do is use modules.

For example, to print out pi, we use the math module as below:

import math
math.pi

3.141592653589793

When we import a module, its names stay in the module's own namespace, i.e. we reach the constant pi with the math prefix. For convenience (and to save an attribute lookup on each use) we can also import just the specific names we want. This puts the name in the current (global) namespace. For example:

from math import pi
pi

It is also possible to use an alias – such as

import math as m
m.pi
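
To make the namespace point concrete, here is a minimal sketch (my own illustration, using only the standard math module) showing that all three import styles give you the same object, and that rebinding a name imported with from ... import only affects the local namespace:

```python
import math
import math as m
from math import pi

# All three names point at the same float object held by the math module.
same_object = (math.pi is pi) and (m.pi is pi)
print(same_object)

# "from math import pi" copies the binding into this namespace only,
# so rebinding the local name leaves math.pi untouched.
pi = 3
print(math.pi)
```

The prefix form (math.pi) is the safer default in longer scripts, because it is always obvious where a name came from.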

You can also create your own modules simply by storing functions in files with a .py extension. So for example we create a file called mymodule.py with the following content

# mymodule.py
def double(number):
    return 2 * number

Then we invoke it as follows. If you get an error, try import sys; print(sys.path) to see the folders searched for modules. You can also use the PYTHONPATH environment variable to include other paths.

import mymodule
mymodule.double(65)
>> 130
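
If you want to see the search path in action, the sketch below writes a small module into a temporary folder, adds that folder to sys.path and imports it. The module name mymodule_demo and the temporary folder are my own stand-ins so the example cannot clash with a real file on your machine:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a tiny module into a temporary folder (a stand-in for your project folder).
module_dir = tempfile.mkdtemp()
Path(module_dir, "mymodule_demo.py").write_text(
    "def double(number):\n    return 2 * number\n"
)

# Python only finds modules in folders listed in sys.path.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import mymodule_demo

result = mymodule_demo.double(65)
print(result)  # 130
```

Setting PYTHONPATH before starting Python has the same effect as the sys.path.insert line, just without touching the code.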

R – Quickly Graph

How to quickly paste data into R and get some results on a Mac (on another OS, swap pbpaste for that system's clipboard command)

df<-read.table(pipe("pbpaste"),sep=" ")
names(df)<-c("date","time","count")
df$dtg<-strptime(paste(df$date,df$time,sep=" "),"%Y-%m-%d %H:%M:%S")
plot(df$dtg,df$count,las=2)
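
For comparison, here is a rough Python sketch of the same parsing step, using only the standard library. The three sample rows are made up; they just follow the "date time count" layout the R snippet expects, and the plotting step is left out:

```python
from datetime import datetime

# Made-up sample in the same "date time count" layout the R snippet expects.
pasted = """2020-01-01 10:00:00 5
2020-01-01 10:05:00 9
2020-01-01 10:10:00 7"""

rows = []
for line in pasted.splitlines():
    date, time, count = line.split(" ")
    # Same format string as the strptime call in the R version.
    dtg = datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S")
    rows.append((dtg, int(count)))

for dtg, count in rows:
    print(dtg.isoformat(), count)
```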

Python Intro

I have been working with Python and have spent many hours studying it, but the best way to learn is to teach 🙂

As with my other guides to PHP, Perl, etc., I will gradually write some intro, intermediate and advanced topics.

As a quick warm up 🙂

$ python -c 'print("Hi World")'
Hi World

$ python -c 'print(type("test"))'
<type 'str'>

$ python -c 'print(3**7)'
2187

Quick bit of OO – defining a class with a method

class Car:
    # class attrib
    category = "Vehicle"

    # instance attrib
    def __init__(self, name, make, model):
        self.name = name
        self.make = make
        self.model = model

    # instance method
    def desc(self):
        return "Name: {} is made by {} and is the {} body shape".format(self.name, self.make, self.model)

Then creating an object from this class and calling methods

fordLaser=Car("Ford Laser","Ford","Hatchback")
print(fordLaser.desc())
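
The difference between a class attribute and an instance attribute is easy to miss, so here is a small sketch of how they behave. The class is a trimmed copy of the one above; the second car and the new category values are made up for the example:

```python
class Car:
    category = "Vehicle"  # class attribute, shared by every instance

    def __init__(self, name, make, model):
        # instance attributes, one set per object
        self.name = name
        self.make = make
        self.model = model


laser = Car("Ford Laser", "Ford", "Hatchback")
golf = Car("VW Golf", "Volkswagen", "Hatchback")

# Assigning through an instance creates an instance attribute
# that shadows the class attribute for that object only.
laser.category = "Classic"
print(laser.category, golf.category, Car.category)

# Changing the class attribute is seen by instances that have not shadowed it.
Car.category = "Motor vehicle"
print(laser.category, golf.category)
```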

Datascience – Ruby

Just having a play with ruby and thought I’d try to simulate summary(x):

irb(main):001:0> y=[]
irb(main):002:0> def x();rand(9999);end;
=> :x
irb(main):003:0> def summary(x=0); puts "min: #{x.min} max: #{x.max} mean: #{(x.sum(0.0)/x.size).round(2)}"; end
=> :summary
irb(main):004:0> 99.times do; y<<x;end
=> 99
irb(main):005:0> summary(y)
min: 23 max: 9851 mean: 5127.23
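
For anyone following the Python posts, a rough equivalent of the irb session might look like the sketch below. It assumes the same idea (99 random integers below 9999, then min, max and mean), and the numbers will differ on every run:

```python
import random
from statistics import mean


def summary(values):
    """Return min, max and mean (2 decimal places) as one line of text."""
    return "min: {} max: {} mean: {}".format(
        min(values), max(values), round(mean(values), 2)
    )


# 99 random integers from 0 to 9998, like rand(9999) in Ruby
y = [random.randrange(9999) for _ in range(99)]
print(summary(y))
```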

Git clone – gitlab docker

HowTo: clone from a gitlab server running inside a docker container

  1. Ensure your SSH pubkey is set up by following https://docs.gitlab.com/ee/ssh/
  2. Check which host port is mapped to the container's SSH port (22/tcp):

docker inspect gitlab | jq '.[0].NetworkSettings.Ports."22/tcp"'

  3. This .ssh/config worked for me

Host git
Hostname 127.0.0.1
User git
Identityfile ~/.ssh/id_ed25519
Port 922

  4. Then clone like this:

git clone git@git:root/my-awesome-project.git
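
If you would rather pull the port out in a script than with jq, the sketch below does the same lookup in Python. The JSON is a made-up, cut-down sample shaped like docker inspect output; on a real system you would feed in the actual command output instead:

```python
import json

# Made-up, cut-down sample shaped like `docker inspect gitlab` output.
sample = """[{"NetworkSettings": {"Ports": {
    "22/tcp": [{"HostIp": "0.0.0.0", "HostPort": "922"}],
    "80/tcp": [{"HostIp": "0.0.0.0", "HostPort": "8080"}]
}}}]"""

# Same path as the jq filter: .[0].NetworkSettings.Ports."22/tcp"
bindings = json.loads(sample)[0]["NetworkSettings"]["Ports"]["22/tcp"]

# The host port is the value to put on the Port line of .ssh/config.
ssh_port = int(bindings[0]["HostPort"])
print(ssh_port)
```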

Pandas Numpy and Matplotlib

I have created demo code with Jupyter Notebook, which can be viewed here: https://github.com/marcuspaget/pythonDSFromScratch/blob/master/PandasDemo.md

Pandas – Python Data Analysis Library

Quick install with:

pip install pandas

Python’s answer to R’s DataFrames for data manipulation

Provides tools to read and write data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format.

Easy to manipulate, slice and dice data, with integrated indexing.

Possible to convert HDF5 to HDFS for ingestion in Hadoop

Time series-functionality:

  • Date range generation & modification
  • Frequency conversion
  • Moving window statistics
  • Join time series without losing data
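
To show what a moving window statistic actually computes, here is a plain-Python sketch of a rolling mean. It is not the pandas API (pandas provides this through its rolling windows); the function name and sample counts are my own:

```python
from collections import deque


def rolling_mean(values, window):
    """Mean of the last `window` values; None until the window is full."""
    buf = deque(maxlen=window)  # old values drop off the left automatically
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / window if len(buf) == window else None)
    return out


counts = [4, 8, 6, 10, 12]
smoothed = rolling_mean(counts, 3)
print(smoothed)
```

The first two results are None because a window of 3 needs three observations before it can produce a value.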

NumPy – Python Numerical Library

pip install numpy
Create, manipulate, slice and run ops on arrays, e.g. std, mean, min, max, etc.
Please see link at top for examples

Matplotlib – Python plotting and figures

pip install matplotlib
Graph data from lists, dataframes, etc.

For example:

[Figure: Matplotlib output example]


R – Sum by category with tapply

Good use of R’s tapply function to summarise data ..

## read a csv file into a table called x – the first row contains column names

x<-read.table("2014-tax.csv",sep=",",header=T)

## In my instance column names are Item,Amount,Cat,Month,Who
## split out by Who

bob<-x[x$Who == "bob",]
jane<-x[x$Who == "jane",]

## Group bob's rows (obs) by category (Cat) and sum the Amount in each group

print(tapply(bob$Amount,bob$Cat,sum))

## Typical output for bob #

#  books  equipment licences stationery  supplies    telephone 
# 303.00 694.27 132.00 345.50 96.00 30.00
#


# Then for jane

print(tapply(jane$Amount,jane$Cat,sum)) 

# books equipment licences stationery supplies telephone 
# 163.0 583.0 348.0 678.4 11543.0 NA
#
#
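
For readers coming from the Python posts, the sketch below shows roughly what tapply is doing here: group the rows by category and sum the amounts in each group. The rows are made up (only the column names match my CSV) and the helper name sum_by_category is my own:

```python
from collections import defaultdict

# Made-up rows using the same column names as the CSV above.
rows = [
    {"Item": "pens", "Amount": 12.5, "Cat": "stationery", "Who": "bob"},
    {"Item": "paper", "Amount": 7.5, "Cat": "stationery", "Who": "bob"},
    {"Item": "novel", "Amount": 30.0, "Cat": "books", "Who": "bob"},
    {"Item": "atlas", "Amount": 45.0, "Cat": "books", "Who": "jane"},
]


def sum_by_category(rows, who):
    """Total Amount per Cat for one person."""
    totals = defaultdict(float)
    for row in rows:
        if row["Who"] == who:
            totals[row["Cat"]] += row["Amount"]
    return dict(totals)


bob_totals = sum_by_category(rows, "bob")
print(bob_totals)
```

One difference from the R output: a category with no rows for that person is simply absent from the dict, rather than showing up as NA.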

Arduino Hacking

Whilst on 5 weeks of long service leave, I created my first Arduino hack to control an LCD (Liquid Crystal Display) I had lying around for the last 6 years 🙂

Please see video below and all code is here:

https://github.com/marcuspaget/ArduinoLCD

Here is the video that shows all the steps, please feel free to leave comments below or on youtube if keen to discuss.

Python Dict

Sample code for working with Python Dicts

# init users list of dicts, print out Bob's id, then init friends list of tuples

users = [
    { "id": 0, "name": "Bob" },
    { "id": 1, "name": "Dunn" },
    { "id": 2, "name": "Sue" },
    { "id": 3, "name": "Chi" },
    { "id": 4, "name": "Thor" },
    { "id": 5, "name": "Clive" },
    { "id": 6, "name": "Hicks" },
    { "id": 7, "name": "Devin" },
    { "id": 8, "name": "Kate" },
    { "id": 9, "name": "Klein" },
    { "id": 10, "name": "Jen" }
]

for user in users:
    if user["name"] == "Bob":
        print("Bob ID:", user["id"])

friends= [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]

# spin through all users and give each one an empty friends list, ready to populate

for user in users:
    user["friends"] = []

# for each (i, j) pair, add user j to user i's friends and vice versa

for i, j in friends:
    users[i]["friends"].append(users[j])
    users[j]["friends"].append(users[i])

# function to return the number of friends for the passed in user

def number_of_friends(user):
    """how many friends does _user_ have?"""
    return len(user["friends"]) # length of the user's friends list

# total up all friends

total_connection = sum(number_of_friends(user)
                       for user in users) # 24

# grab number of users

num_users = len(users)

avg_connections = total_connection / num_users # 24 / 11, approx 2.18

# create a list (user_id, number_of_friends)

num_friends_by_id = [(user["id"], number_of_friends(user)) for user in users] 

print(sorted(num_friends_by_id,key=lambda pair: pair[1], reverse=True))

# Output – largest to smallest

[(1, 3), (2, 3), (3, 3), (5, 3), (8, 3), (0, 2), (4, 2), (6, 2), (7, 2), (9, 1), (10, 0)]
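
As a cross-check on the numbers above, here is an alternative sketch that counts friends straight from the list of tuples with collections.Counter, without touching the user dicts. It assumes the same friends list and the same 11 user ids (0 to 10):

```python
from collections import Counter

friends = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
           (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]

# Each pair adds one connection to both users.
degree = Counter()
for i, j in friends:
    degree[i] += 1
    degree[j] += 1

user_ids = range(11)  # ids 0..10; a Counter returns 0 for ids with no friends
num_friends_by_id = [(uid, degree[uid]) for uid in user_ids]
ranked = sorted(num_friends_by_id, key=lambda pair: pair[1], reverse=True)
print(ranked)

avg_connections = sum(degree.values()) / len(user_ids)
print(round(avg_connections, 2))
```

Because Python's sort is stable, users with the same number of friends stay in id order, which is why the ranking matches the output above.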

Powershell Intro

In recent years it has become necessary to learn PowerShell. This is for a number of reasons, but predominantly because certain automations are only possible via it. More on that later.

First, how to run PowerShell: simply press Windows key + R and type powershell.

Or press the Windows key and start typing powershell, then select Windows PowerShell ISE. This provides an IDE.

Variables

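$x = 'example'   # sample value so that $x expands below
$single_quoted_string = 'a '            # single quotes: taken literally
$double_quoted_string = "a string – $x" # double quotes: variables are expanded
Write-Host $single_quoted_string
Write-Host $double_quoted_string

# can enforce the type with [string], e.g. [string]$s = 'text'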

## Integers
[int]$int_one = 1
$int_two = 2
$int_one + $int_two

## Arrays
[array]$arr = @(1,'str_var',5)
$arr += 'add to arr'
$arr

## Hash tables
[hashtable]$htab = @{'key1' = 'value1'; 'key2' = 'value2'}
$htab

$htab.Get_Item('key1')
$htab.key1

$htab.Add('newkey','new added variable')

$htab.Set_Item('key1', 'mod_val1')
$htab.Remove('key1')

If/Then

$a = 1
if ($a -eq 1) {
Write-Host 'a equals 1'
} else {
Write-Host 'a is not equal to 1'
}

# also possible to add an elseif branch between the if and else blocks, e.g.

# } elseif ($a -eq 3) {

Switch

$a = 1

switch ($a) {
1 {"Value 1"}
2 {"Value 2"}
3 {"Value 3"}
default {"Value exceeds threshold."}
}

$b = "X365"

switch -wildcard ($b) {
"Z*" {"Val Z"}
"Y*" {"Val Y"}
"X*" {"Val X"}
default {"Val outside parameters."}
}
}
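
For those following the Python posts, the wildcard switch above can be approximated with the standard fnmatch module. This is only a sketch with my own function name, and it is not identical to PowerShell: this version is case-sensitive and stops at the first match, whereas PowerShell's switch is case-insensitive by default and checks every clause unless you use break.

```python
from fnmatch import fnmatchcase


def classify(value):
    """Return the label of the first wildcard pattern that matches."""
    patterns = [("Z*", "Val Z"), ("Y*", "Val Y"), ("X*", "Val X")]
    for pattern, label in patterns:
        if fnmatchcase(value, pattern):
            return label
    return "Val outside parameters."  # plays the role of the default clause


b = "X365"
print(classify(b))
```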