## R – Quickly Graph

How to quickly paste clipboard data into R and plot it on a Mac (on another OS, swap `pbpaste` for that system's clipboard command):

```
df <- read.table(pipe("pbpaste"), sep=" ")
names(df) <- c("date", "time", "count")
df$dtg <- strptime(paste(df$date, df$time, sep=" "), "%Y-%m-%d %H:%M:%S")
plot(df$dtg, df$count, las=2)
```
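For comparison, here is a rough sketch of the same parsing step in Python, using only the standard library. The `parse_rows` helper and the sample text are my own assumptions for illustration; the three-column `date time count` layout follows the R snippet above.

```python
from datetime import datetime

def parse_rows(text):
    """Parse lines of 'YYYY-MM-DD HH:MM:SS count' into (datetime, int) pairs."""
    rows = []
    for line in text.strip().splitlines():
        date, time, count = line.split()
        stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        rows.append((stamp, int(count)))
    return rows

# Hypothetical sample standing in for the clipboard contents
sample = """2020-01-01 10:00:00 5
2020-01-01 10:05:00 9"""

for stamp, count in parse_rows(sample):
    print(stamp.isoformat(), count)
```

The resulting list of pairs can then be handed to whatever plotting tool is at hand.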

## Python Intro

I have been working with Python and have spent many hours studying it, but the best way to learn is to teach 🙂

As with my other guides to PHP, Perl, etc., I will gradually write up some intro, intermediate and advanced topics.

As a quick warm-up 🙂

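```
$ python -c 'print("Hi World")'
Hi World

$ python -c 'print(type("test"))'
<type 'str'>

$ python -c 'print(3**7)'
2187
```

The `<type 'str'>` output above is from Python 2; on Python 3 the same command prints `<class 'str'>`.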

A quick bit of OO: defining a class with a class attribute, instance attributes and an instance method

```
class Car:

    # class attrib
    category = "Vehicle"

    # instance attrib
    def __init__(self, name, make, model):
        self.name = name
        self.make = make
        self.model = model

    # instance method
    def desc(self):
        return "Name: {} is made by {} and is the {} body shape".format(self.name, self.make, self.model)
```

Then create an object from this class and call its method:

```
fordLaser = Car("Ford Laser", "Ford", "Hatchback")
print(fordLaser.desc())
```
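To make the difference between the class attribute and the instance attributes more concrete, here is a minimal, self-contained sketch. It repeats a trimmed copy of the `Car` class so it runs on its own; the second car (`mazda`) and the `"Classic"` value are made up for illustration.

```python
class Car:
    # class attribute: one value shared by every instance
    category = "Vehicle"

    def __init__(self, name, make, model):
        # instance attributes: set separately on each object
        self.name = name
        self.make = make
        self.model = model

ford = Car("Ford Laser", "Ford", "Hatchback")
mazda = Car("Mazda 3", "Mazda", "Sedan")  # hypothetical second car

# Both objects read the same class attribute, but each has its own name
print(ford.category, mazda.category)
print(ford.name, "/", mazda.name)

# Assigning through an instance creates an instance attribute
# that shadows the class attribute for that object only
ford.category = "Classic"
print(ford.category, mazda.category, Car.category)
```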

## Datascience – Ruby

Just having a play with Ruby and thought I’d try to simulate R’s summary(x):

```
irb(main):001:0> y=[]
irb(main):002:0> def x();rand(9999);end;
=> :x
irb(main):003:0> def summary(x=0); puts "min: #{x.min} max: #{x.max} mean: #{(x.sum(0.0)/x.size).round(2)}"; end
=> :summary
irb(main):004:0> 99.times do; y<<x;end
=> 99
irb(main):005:0> summary(y)
min: 23 max: 9851 mean: 5127.23
```
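For comparison, roughly the same experiment can be sketched in Python. The `summary` name mirrors the Ruby one, and I am assuming `random.randrange(9999)` is a fair stand-in for Ruby's `rand(9999)` (both give integers from 0 to 9998). The numbers are random, so the output differs on every run.

```python
import random

def summary(values):
    """Return min, max and mean (rounded to 2 places) of a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": round(sum(values) / len(values), 2),
    }

# 99 random integers in 0..9998, like the 99.times loop in the irb session
y = [random.randrange(9999) for _ in range(99)]
s = summary(y)
print("min: {min} max: {max} mean: {mean}".format(**s))
```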

## Git clone – gitlab docker

HowTo: clone from a GitLab server running inside a Docker container

1. Ensure your SSH public key is set up by following https://docs.gitlab.com/ee/ssh/
2. Check which host port is mapped to the container's SSH port (22/tcp):

`docker inspect gitlab | jq '.[0].NetworkSettings.Ports."22/tcp"'`

3. This `~/.ssh/config` entry worked for me:

```
Host git
    HostName 127.0.0.1
    User git
    IdentityFile ~/.ssh/id_ed25519
    Port 922
```

4. Then clone like this:

`git clone [email protected]:root/my-awesome-project.git`
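If jq is not installed, step 2 can be sketched in Python instead. The JSON below is a hand-written, trimmed sample in the shape that `docker inspect` prints, not real output, and the `ssh_host_port` helper name is hypothetical.

```python
import json

def ssh_host_port(inspect_json, container_port="22/tcp"):
    """Return the first host port mapped to container_port, or None if unmapped."""
    containers = json.loads(inspect_json)
    bindings = containers[0]["NetworkSettings"]["Ports"].get(container_port)
    if not bindings:  # key missing, or null when the port is not published
        return None
    return int(bindings[0]["HostPort"])

# Hand-written sample in the shape of `docker inspect gitlab` output (assumed, trimmed)
sample = """
[{"NetworkSettings": {"Ports": {
    "22/tcp": [{"HostIp": "0.0.0.0", "HostPort": "922"}],
    "80/tcp": null
}}}]
"""

print(ssh_host_port(sample))
```

In practice the JSON text would come from running `docker inspect gitlab`, for example via `subprocess.run(["docker", "inspect", "gitlab"], capture_output=True, text=True).stdout`.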

## Pandas, NumPy and Matplotlib

I have created demo code in a Jupyter Notebook, which can be viewed here: https://github.com/marcuspaget/pythonDSFromScratch/blob/master/PandasDemo.md

Pandas – Python Data Analysis Library

Quick install with:

`pip install pandas`

Pandas is Python’s answer to R’s data frames for data manipulation.

It provides tools to read and write data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format.

It makes it easy to manipulate, slice and dice data, with integrated indexing.

HDF5 files can also be moved into HDFS for ingestion into Hadoop.

Time-series functionality:

• Date range generation & modification
• Frequency conversion
• Moving window statistics
• Join time series without losing data
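The time-series features listed above can be sketched in a few lines. This is a minimal example with made-up data, assuming pandas is installed; the variable names are my own.

```python
import pandas as pd

# Date range generation: 10 consecutive days
idx = pd.date_range("2020-01-01", periods=10, freq="D")
counts = pd.Series(range(10), index=idx, dtype=float)

# Frequency conversion: daily values summed into 2-day bins
two_day = counts.resample("2D").sum()

# Moving window statistics: 3-day rolling mean (first two values are NaN)
moving = counts.rolling(window=3).mean()

# Join time series without losing data: an outer join on the date index,
# so dates present in only one series are kept and filled with NaN
other = pd.Series(1.0, index=pd.date_range("2020-01-08", periods=5, freq="D"))
joined = pd.concat([counts.rename("a"), other.rename("b")], axis=1)

print(two_day)
print(moving)
print(joined)
```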

NumPy – Python numerical library

`pip install numpy`

Create, manipulate, slice and run operations on arrays, e.g. std, mean, min, max, etc.

Please see the link at the top for examples.
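As a quick taste of those operations, here is a minimal sketch with a made-up array, assuming NumPy is installed.

```python
import numpy as np

# Create: a 3x4 array holding 0..11
a = np.arange(12).reshape(3, 4)

# Slice: the second row, then the last two columns of every row
row = a[1]
cols = a[:, -2:]

# Manipulate: element-wise arithmetic needs no explicit loop
doubled = a * 2

# Run ops: statistics over the whole array, or along one axis
print(a.min(), a.max(), a.mean(), a.std())
print(a.mean(axis=0))  # one mean per column
```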

Matplotlib – Python plotting and figures

`pip install matplotlib`

Graph data from lists, DataFrames, etc.
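As a small example of graphing from plain lists, here is a minimal sketch, assuming Matplotlib is installed. The data is made up, and the `Agg` backend is chosen only so the snippet runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up sample data: plain Python lists are enough
days = [1, 2, 3, 4, 5]
counts = [3, 7, 4, 9, 6]

fig, ax = plt.subplots()
ax.plot(days, counts, marker="o")
ax.set_xlabel("day")
ax.set_ylabel("count")
ax.set_title("Counts per day")

# Render to an in-memory PNG; fig.savefig("plot.png") would write a file instead
buf = io.BytesIO()
fig.savefig(buf, format="png")
png_bytes = buf.getvalue()
plt.close(fig)
```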

## R – Sum by category with tapply

A good use of R’s tapply function to summarise data:

```
## read a csv file into a table called x – the first row contains column names
x <- read.table("2014-tax.csv", sep=",", header=TRUE)

## In my instance column names are Item,Amount,Cat,Month,Who
## split out by Who
bob <- x[x$Who == "bob", ]
jane <- x[x$Who == "jane", ]

## Group the rows (obs) by Cat and sum the Amount
print(tapply(bob$Amount, bob$Cat, sum))

## Typical output for bob
#      books  equipment   licences stationery   supplies  telephone
#     303.00     694.27     132.00     345.50      96.00      30.00

## Then for jane
print(tapply(jane$Amount, jane$Cat, sum))

#      books  equipment   licences stationery   supplies  telephone
#      163.0      583.0      348.0      678.4    11543.0         NA
```
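For comparison, the same sum-by-category idea can be sketched in plain Python. The `sum_by_category` helper and the rows below are made up for illustration; only the column names (Item, Amount, Cat, Month, Who) come from the CSV described above.

```python
from collections import defaultdict

def sum_by_category(rows, who):
    """Sum Amount per Cat for one person, similar in spirit to tapply(Amount, Cat, sum)."""
    totals = defaultdict(float)
    for row in rows:
        if row["Who"] == who:
            totals[row["Cat"]] += row["Amount"]
    return dict(totals)

# Made-up rows using the same column names as the CSV
rows = [
    {"Item": "pens", "Amount": 12.50, "Cat": "stationery", "Month": "Jul", "Who": "bob"},
    {"Item": "paper", "Amount": 7.50, "Cat": "stationery", "Month": "Aug", "Who": "bob"},
    {"Item": "novel", "Amount": 30.00, "Cat": "books", "Month": "Aug", "Who": "bob"},
    {"Item": "laptop", "Amount": 900.00, "Cat": "equipment", "Month": "Sep", "Who": "jane"},
]

print(sum_by_category(rows, "bob"))
print(sum_by_category(rows, "jane"))
```

One difference from the R output: a category with no rows for a person is simply absent from the dict here, whereas tapply shows it as NA (as with jane's telephone column).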

## Arduino Hacking

Whilst on five weeks of long service leave, I created my first Arduino hack to control an LCD (Liquid Crystal Display) that I had had lying around for the last 6 years 🙂

Please see the video below; all the code is here:

https://github.com/marcuspaget/ArduinoLCD

Here is the video that shows all the steps. Please feel free to leave comments below or on YouTube if you are keen to discuss.

## Python Dict

Sample code for working with Python Dicts

```
# init users list of dicts, print out Bob's id, then init friends list of tuples
users = [
    { "id": 0, "name": "Bob" },
    { "id": 1, "name": "Dunn" },
    { "id": 2, "name": "Sue" },
    { "id": 3, "name": "Chi" },
    { "id": 4, "name": "Thor" },
    { "id": 5, "name": "Clive" },
    { "id": 6, "name": "Hicks" },
    { "id": 7, "name": "Devin" },
    { "id": 8, "name": "Kate" },
    { "id": 9, "name": "Klein" },
    { "id": 10, "name": "Jen" }
]

for user in users:
    if user["name"] == "Bob":
        print("Bob ID: ", user["id"])

friends = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5),
           (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]

# spin through all users and create an empty list to store each user's friends
for user in users:
    user["friends"] = []

# populate: add the right side of each tuple to the left side's list, and vice versa
for i, j in friends:
    users[i]["friends"].append(users[j])
    users[j]["friends"].append(users[i])

# function to return the number of friends of the passed-in user
def number_of_friends(user):
    """how many friends does _user_ have?"""
    return len(user["friends"])  # length of the friends list

# total up all friends
total_connection = sum(number_of_friends(user)
                       for user in users)  # 24

# grab number of users
num_users = len(users)  # 11
avg_connections = total_connection / num_users  # 24 / 11, about 2.18

# create a list (user_id, number_of_friends)
num_friends_by_id = [(user["id"], number_of_friends(user)) for user in users]
print(sorted(num_friends_by_id, key=lambda pair: pair[1], reverse=True))
```

Output, largest to smallest:

`[(1, 3), (2, 3), (3, 3), (5, 3), (8, 3), (0, 2), (4, 2), (6, 2), (7, 2), (9, 1), (10, 0)]`
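One side effect of the code above is that each user dict holds its friends' dicts, which in turn hold it, so the structure is circular: printing `users` shows `{...}` placeholders and `json.dumps(users)` raises an error. A possible variant, sketched here on a trimmed copy of the data, keeps friend ids in a separate dict instead. The `friend_ids` and `by_id` names are my own.

```python
from collections import defaultdict

# Trimmed copy of the data above: the first four users and the pairs between them
users = [
    {"id": 0, "name": "Bob"},
    {"id": 1, "name": "Dunn"},
    {"id": 2, "name": "Sue"},
    {"id": 3, "name": "Chi"},
]
friends = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

# Map each user id to a list of friend ids instead of nesting user dicts
friend_ids = defaultdict(list)
for i, j in friends:
    friend_ids[i].append(j)
    friend_ids[j].append(i)

# Index users by id so a lookup does not need a loop
by_id = {user["id"]: user for user in users}

for user_id in sorted(by_id):
    names = [by_id[f]["name"] for f in friend_ids[user_id]]
    print(by_id[user_id]["name"], len(names), names)
```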

## PowerShell Intro

In recent years it has become necessary to learn PowerShell. This is for a number of reasons, but predominantly because certain automations are only possible with it. More on that later.

First, how to run PowerShell: press Windows key + R and type `powershell`.

Or press the Windows key, start typing powershell and select Windows PowerShell ISE. This provides an IDE.

Variables

```
# single quotes are literal; double quotes expand variables
# ($x is not defined here, so it expands to an empty string)
$single_quoted_string = 'a '
$double_quoted_string = "a string – $x"
Write-Host $single_quoted_string
Write-Host $double_quoted_string

# the type can be enforced with a [string] prefix, as with [int] below

## Integers
[int]$int_one = 1
$int_two = 2
$int_one + $int_two

## Arrays
[array]$arr = @(1,'str_var',5)
$arr

## Hash tables
[hashtable]$htab = @{'key1' = 'value1'; 'key2' = 'value2'}
$htab

$htab.Get_Item('key1')
$htab.key1

$htab.Set_Item("key1", "mod_val1")
$htab.Remove('key1')
```

If/Then

```
$a = 1
if ($a -eq 1) {
    Write-Host 'a equals 1'
} else {
    Write-Host 'a is not equal to 1'
}

# also possible to add an elseif branch between the if and else blocks:
# } elseif ($a -eq 3) {
```

Switch

```
$a = 1

switch ($a) {
    1 {"Value 1"}
    2 {"Value 2"}
    3 {"Value 3"}
    default {"Value exceeds threshold."}
}

$b = "X365"

switch -wildcard ($b) {
    "Z*" {"Val Z"}
    "Y*" {"Val Y"}
    "X*" {"Val X"}
    default {"Val outside parameters."}
}
```
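For readers coming from the Python posts, the wildcard switch can be approximated with the standard library's `fnmatch` module. This is only a rough sketch: the `classify` function name is my own, and unlike PowerShell's switch (which tests every clause unless `break` is used) it returns on the first match.

```python
from fnmatch import fnmatchcase

def classify(value):
    """Return the label of the first wildcard pattern that matches value."""
    patterns = [("Z*", "Val Z"), ("Y*", "Val Y"), ("X*", "Val X")]
    for pattern, label in patterns:
        # Lower-case both sides because -wildcard ignores case by default
        if fnmatchcase(value.lower(), pattern.lower()):
            return label
    return "Val outside parameters."

print(classify("X365"))
```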

