R – Quickly Graph

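How to quickly paste data into R and plot it on a Mac (on another OS, swap pbpaste for that system's clipboard command). The pasted data is expected to have three space-separated columns: date, time and count.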

df<-read.table(pipe("pbpaste"),sep=" ")
names(df)<-c("date","time","count")
df$dtg<-strptime(paste(df$date,df$time,sep=" "),"%Y-%m-%d %H:%M:%S")
plot(df$dtg,df$count,las=2)
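
For readers following the Python posts on this site, the parsing step can be sketched in Python too. This is only an illustration: the function name `parse_rows` and the sample data are made up, and it assumes the same three space-separated columns (date, time, count) as the R snippet.

```python
from datetime import datetime

def parse_rows(text):
    """Parse 'YYYY-MM-DD HH:MM:SS count' lines into (datetime, int) pairs."""
    rows = []
    for line in text.strip().splitlines():
        date, time, count = line.split()
        stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        rows.append((stamp, int(count)))
    return rows

# Made-up sample standing in for the clipboard contents
sample = """2021-03-01 09:00:00 12
2021-03-01 09:05:00 17
2021-03-01 09:10:00 9"""

rows = parse_rows(sample)
print(rows[0])
```

The resulting pairs can then be split into x and y lists and handed to a plotting library such as Matplotlib.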

Python Intro

I have been working with Python and have spent many hours studying it, but the best way to learn is to teach 🙂

As with my other guides to PHP, Perl, etc., I will gradually write up intro, intermediate and advanced topics.

As a quick warm-up 🙂

$ python -c 'print("Hi World")'
Hi World

$ python -c 'print(type("test"))'
<type 'str'>

$ python -c 'print(3**7)'
2187

Quick bit of OO – defining a class with a method

class Car:

    # class attrib
    category = "Vehicle"

    # instance attrib
    def __init__(self,name,make,model):
        self.name=name
        self.make=make
        self.model=model

    # instance method
    def desc(self):
        return "Name: {} is made by {} and is the {} body shape".format(self.name,self.make,self.model)

Then create an object from this class and call its method:

fordLaser=Car("Ford Laser","Ford","Hatchback")
print(fordLaser.desc())
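
The comments in the class mention a class attribute and instance attributes. A minimal sketch of the difference is below. It repeats the class so it runs on its own, and the second car (`mazda_323`) is made up purely for illustration.

```python
class Car:
    category = "Vehicle"  # class attribute, shared by all instances

    def __init__(self, name, make, model):
        # instance attributes, stored separately on each object
        self.name = name
        self.make = make
        self.model = model

    def desc(self):
        return "Name: {} is made by {} and is the {} body shape".format(
            self.name, self.make, self.model)

ford_laser = Car("Ford Laser", "Ford", "Hatchback")
mazda_323 = Car("Mazda 323", "Mazda", "Sedan")  # hypothetical second car

# Both instances read the same class attribute
print(ford_laser.category, mazda_323.category)

# Assigning on one instance creates an instance attribute that shadows
# the class attribute for that object only
mazda_323.category = "Classic"
print(Car.category, ford_laser.category, mazda_323.category)
```

The second print shows that only `mazda_323` sees the new value; the class and the other instance still report "Vehicle".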

Datascience – Ruby

Just having a play with Ruby and thought I’d try to simulate R’s summary(x):

irb(main):001:0> y=[]
irb(main):002:0> def x();rand(9999);end;
=> :x
irb(main):003:0> def summary(x=0); puts "min: #{x.min} max: #{x.max} mean: #{(x.sum(0.0)/x.size).round(2)}"; end
=> :summary
irb(main):004:0> 99.times do; y<<x;end
=> 99
irb(main):005:0> summary(y)
min: 23 max: 9851 mean: 5127.23
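
For comparison with the Python posts, roughly the same experiment can be sketched in Python. This is not a port of R's summary(), just the same min, max and mean; the fixed seed is an assumption added so that repeated runs give the same numbers.

```python
import random
from statistics import mean

def summary(values):
    """Return min, max and mean (rounded to 2 places) as one line of text."""
    return "min: {} max: {} mean: {}".format(
        min(values), max(values), round(mean(values), 2))

random.seed(1)  # fixed seed only so repeated runs match
y = [random.randrange(9999) for _ in range(99)]  # 99 integers in 0..9998
print(summary(y))
```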

Git clone – gitlab docker

HowTo: clone from a GitLab server running inside a Docker container

  1. Ensure your SSH public key is set up by following https://docs.gitlab.com/ee/ssh/
  2. Check which host port is mapped to the container's SSH port (22):

docker inspect gitlab | jq '.[0].NetworkSettings.Ports."22/tcp"'

  3. This .ssh/config worked for me (Port should match the host port found in step 2):

Host git
    HostName 127.0.0.1
    User git
    IdentityFile ~/.ssh/id_ed25519
    Port 922

  4. Then clone like this:

git clone git@git:root/my-awesome-project.git
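
If jq is not available, the port lookup from step 2 can also be sketched in Python. The JSON below is a trimmed, made-up sample shaped like `docker inspect` output (the real output has many more fields), and `host_port` is a hypothetical helper name.

```python
import json

# Trimmed, made-up sample of `docker inspect gitlab` output
sample = '''
[{"NetworkSettings": {"Ports": {
    "22/tcp": [{"HostIp": "0.0.0.0", "HostPort": "922"}],
    "80/tcp": [{"HostIp": "0.0.0.0", "HostPort": "8080"}]
}}}]
'''

def host_port(inspect_json, container_port="22/tcp"):
    """Return the first host port mapped to container_port, or None if unpublished."""
    ports = json.loads(inspect_json)[0]["NetworkSettings"]["Ports"]
    bindings = ports.get(container_port) or []  # unpublished ports show up as null
    return int(bindings[0]["HostPort"]) if bindings else None

print(host_port(sample))
```

In practice the JSON would come from running `docker inspect gitlab` and capturing its output rather than from a hard-coded string.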

Pandas, NumPy and Matplotlib

I have created demo code with Jupyter Notebook, which can be viewed here: https://github.com/marcuspaget/pythonDSFromScratch/blob/master/PandasDemo.md

Pandas – Python Data Analysis Library

Quick install with:

pip install pandas

Python’s answer to R’s data frames for data manipulation.

It provides tools to read and write data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format.

Data is easy to manipulate, slice and dice, with integrated indexing.

It is also possible to move HDF5 data into HDFS for ingestion in Hadoop.

Time series functionality:

  • Date range generation & modification
  • Frequency conversion
  • Moving window statistics
  • Join time series without losing data
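
To show what a moving window statistic actually computes, here is a plain-Python sketch of a rolling mean. It deliberately does not use pandas, so it runs without installing anything; pandas offers the same idea through `Series.rolling(window).mean()`. The helper name `rolling_mean` and the sample readings are made up.

```python
from collections import deque

def rolling_mean(values, window):
    """Mean of the last `window` values at each position; None until the window fills."""
    buf = deque(maxlen=window)  # oldest value drops out automatically
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / window if len(buf) == window else None)
    return out

counts = [4, 8, 6, 2, 10]  # made-up readings
print(rolling_mean(counts, 3))
```

The first two positions are None because a window of three values is not yet available; pandas reports NaN in the same positions.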

NumPy – Python Numerical Library

pip install numpy
Create, manipulate, slice and run operations on arrays, e.g. std, mean, min, max, etc.
Please see the link at the top of this post for examples.

Matplotlib – Python plotting and figures

pip install matplotlib
Graph data from lists, dataframes, etc.

For example:

[Figure: Matplotlib output example]


R – Sum by category with tapply

A good use of R’s tapply function to summarise data.

## read a csv file into a table called x – the first row contains column names

x<-read.table("2014-tax.csv",sep=",",header=T)

## In my instance column names are Item,Amount,Cat,Month,Who
## split out by Who

bob<-x[x$Who == "bob",]
jane<-x[x$Who == "jane",]

## Group the rows (obs) by Cat and sum the Amount within each group

print(tapply(bob$Amount,bob$Cat,sum))

## Typical output for bob
#
#      books  equipment   licences stationery   supplies  telephone
#     303.00     694.27     132.00     345.50      96.00      30.00

## Then for jane

print(tapply(jane$Amount,jane$Cat,sum))

#      books  equipment   licences stationery   supplies  telephone
#      163.0      583.0      348.0      678.4    11543.0         NA
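
For readers more at home in Python, the same group-and-sum idea can be sketched with a dictionary. The rows below are made up (only the column names Item, Amount, Cat and Who come from the post), and `sum_by_category` is a hypothetical helper, not a stand-in for everything tapply can do.

```python
from collections import defaultdict

# Made-up rows using the column names described in the post
rows = [
    {"Item": "Python book", "Amount": 45.0, "Cat": "books", "Who": "bob"},
    {"Item": "Monitor", "Amount": 300.0, "Cat": "equipment", "Who": "bob"},
    {"Item": "R book", "Amount": 55.5, "Cat": "books", "Who": "bob"},
    {"Item": "Paper", "Amount": 12.0, "Cat": "stationery", "Who": "jane"},
]

def sum_by_category(rows, who):
    """Sum Amount per Cat for the rows belonging to one person."""
    totals = defaultdict(float)
    for row in rows:
        if row["Who"] == who:
            totals[row["Cat"]] += row["Amount"]
    return dict(totals)

print(sum_by_category(rows, "bob"))
```

One difference worth noting: R reports NA for a category a person has no rows in, whereas this sketch simply leaves that category out of the result.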

Arduino Hacking

Whilst on 5 weeks of long service leave, I created my first Arduino hack: controlling an LCD (Liquid Crystal Display) I had had lying around for the last 6 years 🙂

Please see the video below; all the code is here:

https://github.com/marcuspaget/ArduinoLCD

Here is the video that shows all the steps. Please feel free to leave comments below or on YouTube if you are keen to discuss.

Python Dict

Sample code for working with Python Dicts

# init users list of dicts, print out Bob's id, then init friends list of tuples

users = [
    { "id": 0, "name": "Bob" },
    { "id": 1, "name": "Dunn" },
    { "id": 2, "name": "Sue" },
    { "id": 3, "name": "Chi" },
    { "id": 4, "name": "Thor" },
    { "id": 5, "name": "Clive" },
    { "id": 6, "name": "Hicks" },
    { "id": 7, "name": "Devin" },
    { "id": 8, "name": "Kate" },
    { "id": 9, "name": "Klein" },
    { "id": 10, "name": "Jen" }
]

# loop over the users directly; no separate index counter is needed
for user in users:
    if user["name"] == "Bob":
        print("Bob ID: ", user["id"])

friends= [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]

# spin through all users and create an empty list to store each user's friends

for user in users:
    user["friends"] = []

# for each (i, j) pair, add user j to user i's friends and vice versa

for i, j in friends:
    users[i]["friends"].append(users[j])
    users[j]["friends"].append(users[i])

# function to return the number of friends of the passed-in user

def number_of_friends(user):
    """how many friends does _user_ have?"""
    return len(user["friends"]) # length of the friends list

# total up all friends

total_connection = sum(number_of_friends(user) for user in users) # 24

# grab number of users

num_users = len(users)

avg_connections = total_connection / num_users # 24 / 11, about 2.18

# create a list (user_id, number_of_friends)

num_friends_by_id = [(user["id"], number_of_friends(user)) for user in users] 

print(sorted(num_friends_by_id,key=lambda pair: pair[1], reverse=True))

# Output – largest to smallest

[(1, 3), (2, 3), (3, 3), (5, 3), (8, 3), (0, 2), (4, 2), (6, 2), (7, 2), (9, 1), (10, 0)]
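
One thing to be aware of: the code above stores whole user dicts inside each friends list, so users end up referring to each other in a cycle, and something like json.dumps(users) would fail with a circular reference error. A possible alternative, sketched here with a shortened made-up sample, is to keep only friend ids in a separate dict (the name `friend_ids` is an assumption for illustration).

```python
users = [
    {"id": 0, "name": "Bob"},
    {"id": 1, "name": "Dunn"},
    {"id": 2, "name": "Sue"},
]
friends = [(0, 1), (0, 2)]  # shortened sample of friendship pairs

# map each user id to a list of friend ids instead of nesting user dicts
friend_ids = {user["id"]: [] for user in users}
for i, j in friends:
    friend_ids[i].append(j)
    friend_ids[j].append(i)

# (user_id, number_of_friends), largest first
num_friends_by_id = sorted(
    ((uid, len(ids)) for uid, ids in friend_ids.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
print(num_friends_by_id)
```

The counting and sorting work the same way as before; only the storage changes, which keeps the user records free of cycles.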

PowerShell Intro

In recent years it has become necessary to learn PowerShell. This is for a number of reasons, but predominantly because certain automations are only possible via it. More on that later.

First, how to run PowerShell: simply press Windows key + R and type powershell.

Or press the Windows key, start typing powershell and select Windows PowerShell ISE, which provides an IDE.

Variables

$x = 'value' # example value so the double-quoted string has something to expand
$single_quoted_string = 'a '
$double_quoted_string = "a string - $x"
Write-Host $single_quoted_string
Write-Host $double_quoted_string

# can enforce with [string]

## Integers
[int]$int_one = 1
$int_two = 2
$int_one + $int_two

## Arrays
[array]$arr = @(1,'str_var',5)
$arr += 'add to arr'
$arr

## Hash tables
[hashtable]$htab = @{'key1' = 'value1'; 'key2' = 'value2'}
$htab

$htab.Get_Item('key1')
$htab.key1

$htab.Add('newkey',"new added variable")

$htab.Set_Item("key1", "mod_val1")
$htab.Remove('key1')

If/Then

$a = 1
if ($a -eq 1) {
    Write-Host 'a equals 1'
} elseif ($a -eq 3) {
    # elseif is also possible, for extra conditions between if and else
    Write-Host 'a equals 3'
} else {
    Write-Host 'a is not equal to 1 or 3'
}

Switch

$a = 1

switch ($a) {
    1 {"Value 1"}
    2 {"Value 2"}
    3 {"Value 3"}
    default {"Value exceeds threshold."}
}

$b = "X365"

switch -wildcard ($b) {
    "Z*" {"Val Z"}
    "Y*" {"Val Y"}
    "X*" {"Val X"}
    default {"Val outside parameters."}
}
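
For readers coming from the Python posts, the wildcard switch above maps roughly onto the standard library's fnmatch module. This is only a rough sketch with a made-up function name (`classify`). Note two differences: it is case-sensitive and stops at the first match, whereas PowerShell's switch is case-insensitive by default and keeps testing the remaining patterns unless a break is used.

```python
from fnmatch import fnmatchcase

def classify(value):
    """Return the label of the first wildcard pattern that matches value."""
    patterns = [("Z*", "Val Z"), ("Y*", "Val Y"), ("X*", "Val X")]
    for pattern, label in patterns:
        if fnmatchcase(value, pattern):  # case-sensitive wildcard match
            return label
    return "Val outside parameters."

print(classify("X365"))
```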

1 – Lessons begin

Welcome to Coding-School.com old school!

For some of the tips, techniques and code freely imparted here, I had previously searched the web and come back empty-handed, so I set about hand-crafting my own solutions, which you get completely for free!

If you have found my website useful, please consider buying me a coffee below 😉

Cheers and enjoy,

Mark