Remove content from gzips

[problem]

You need to remove lines matching a pattern from gzip-compressed files.

Effectively: decompress, remove the matching lines and recompress.

[/problem]

[solution]

A collection of shell commands that decompress each file, filter out the matching lines and recompress it.

[/solution]

[example]

* First, find the gzip files that contain the patterns (substitute your own patterns for pattA and pattB in each of the commands below):


for na in *gz
do
# grep -c counts matching lines; a non-zero count means the file needs cleaning
[[ $(gzip -c -d "$na" | grep -Eic 'pattA|pattB') -ne 0 ]] && {
echo "$na"
}
done > /tmp/filelist.out

* Then do a dry run of removing the offending lines:


for na in $(</tmp/filelist.out)
do
gzip -c -d "$na" | grep -Eiv 'pattA|pattB' |
gzip --best > /tmp/replace.gz
echo "going to mv /tmp/replace.gz $na"
ls -ld /tmp/replace.gz "$na"

done

* Now do the actual move:


for na in $(</tmp/filelist.out)
do
gzip -c -d "$na" | grep -Eiv 'pattA|pattB' |
gzip --best > /tmp/replace.gz
mv /tmp/replace.gz "$na"

done

* Then retest; this should print nothing if every matching line was removed:


for na in $(</tmp/filelist.out)
do
gzip -c -d "$na" | grep -Ei 'pattA|pattB' |
sed "s/^/$na: /"

done
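
The find, filter and replace steps above can also be folded into one function. This is only a sketch, assuming bash, gzip and mktemp are available; the function name strip_gz_lines and the sample file are made up for illustration. It writes the filtered copy next to the original, checks it with gzip -t, and only then replaces the original.

```shell
#!/bin/bash
# strip_gz_lines PATTERN FILE...  (hypothetical helper, not from the original post)
# Removes lines matching the extended regex PATTERN (case-insensitive)
# from each gzip FILE, replacing the file only if the new archive is valid.
strip_gz_lines() {
    local pattern=$1 f tmp
    shift
    for f in "$@"; do
        # Skip files that contain no matching line.
        gzip -cd -- "$f" | grep -Eiq -- "$pattern" || continue
        # Temp file in the same directory, so mv is a rename on one filesystem.
        tmp=$(mktemp "${f}.XXXXXX") || return 1
        # grep -v exits 1 when nothing is left; that is not an error here.
        gzip -cd -- "$f" | { grep -Eiv -- "$pattern" || true; } | gzip --best > "$tmp"
        # Test the new archive via stdin before touching the original.
        if gzip -t < "$tmp"; then
            mv -- "$tmp" "$f"
        else
            rm -f -- "$tmp"
            echo "strip_gz_lines: $f left untouched" >&2
        fi
    done
}

# Demo on a throwaway file.
demo=$(mktemp -d)
printf 'keep me\nsecret pattA here\nalso keep\n' | gzip > "$demo/sample.gz"
strip_gz_lines 'pattA|pattB' "$demo/sample.gz"
gzip -cd "$demo/sample.gz"
```

Note that mktemp creates the new file with restrictive permissions, so the replaced file may not keep the mode of the original; adjust with chmod if that matters.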

[/example]

[reference]

[tags]gzip, sed, Unix Coding School[/tags]

[/reference]

Cut and paste a zip file? You bet!

[problem]

You have a shell open on two UNIX boxes, but scp, ftp, etc. are locked down?

Well they left Perl!!! 🙂

[/problem]

[solution]

I recognise this could be messy for massive files – but it works beautifully on binaries, images, etc.

I am demo’ing it here on tar gzipped output. It produces some errors, but still recreates the files.

[/solution]

[example]

  • First to prove the files with checksums (11171, 28913)

  • $ find test -type f -exec sum {} \; -ls
    11171 18
    294632 20 -rwxr--r-- 1 marcus marcus 17461 Feb 15 21:49 test/testA/mun-weir.jpg
    28913 19
    294633 20 -rw-r--r-- 1 marcus marcus 18505 Feb 15 21:49 test/testB/perth-swan-river.jpg

  • Now to generate the text for copying – using base64 encoding to convert binary to text

  • tar zcvf - test | perl -MMIME::Base64 -e 'while(<>) {
    $str.=(encode_base64($_)); } print "$str\n";'

    This produces a listing of the files/dirs and then, starting from H4, the text version of the binary output. I have truncated it here, as it gets quite big. You can redirect it into a file; if you are using Cygwin, just open that file with WordPad.

    test/
    test/testA/
    test/testA/mun-weir.jpg
    test/testB/
    test/testB/perth-swan-river.jpg
    H4sIAGVX1EUAA+z8BWxlwfMuBl6PmZnZvmYaMzMzMzMzMzMz22P7mpnZ4zHD2B4zM+OYGXZ+/5dk
    k0hRFCl5u6t9daVzjlTd1V931anuOlf63Mxd3ZgA/8/K93/Cyc7+nzsrJwvz//r+PwuA+Ts7B/t3
    dlZWFk7Ad2ZmNmZmAAn7/8O4/kvcXd2MXUhIAPbGLqburv/H7f7P9P8/Km7/8f9/LsL/z0XB/yX/
    s7P+8z8LG+f3/+H//x7yv/K/vbsDg6e5tQujjZPl/61j/Jf/2dj+D/zPxvKdg/1/5382TjZ2AMn3
    /1tR/B/I/5/7/2vlaweAIiMhLQEAAwMDSP37Ab52AVjiXtYWAIC8PID2XyOY/6ntOkAUAAkOAQEB
    DvnvAgkJCQWNAAP9T5Dh4WERUJHR0FCRUVHRsQiw0THwMFFRcUhx8AgJiYmJ0bHJgGREQAIiYqL/
    GAH71xUaCg==
    GgkGBokIAxWD6P+yfPUDUGEAYWD34GBkgG+oYOCoYF9DAMJ/GKH/M4v/t3wDh/gHEgwG9p9aHgXw

  • Now to extract it
  • First we paste it into a file, here tar.uu (any name will do), or you can just run cat | perl and paste to stdin.


    [marcus@zion ~]$ cat tar.uu | perl -MMIME::Base64 -e 'while(<>) { $str.=(decode_base64($_)); } print "$str\n";' | tar zxvf -

    gzip: stdin: decompression OK, trailing garbage ignored
    test/
    test/testA/
    test/testA/mun-weir.jpg
    test/testB/
    test/testB/perth-swan-river.jpg
    tar: Child returned status 2
    tar: Error exit delayed from previous errors

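    As you can see, it produces a bit of garbage. The warning most likely comes from the extra newline that print adds after the decoded data; the files are still extracted.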

  • Prove the files are reproduced – with sums (11171, 28913)

  • [marcus@zion ~]$ find test -type f -exec sum {} \; -ls
    28913 19
    1555 24 -rw-r--r-- 1 marcus adm 18505 Feb 15 21:49 test/testB/perth-swan-river.jpg
    11171 18
    1553 24 -rwxr--r-- 1 marcus adm 17461 Feb 15 21:49 test/testA/mun-weir.jpg

    And you are done. 🙂

    [/example]

    [reference]

    [tags]cut and paste binary files, base64, Unix Coding School[/tags]

    [/reference]

    Octal dump of a file – od

    [problem]

    Whenever you cat a file, it shows nothing!

    Or it destroys your terminal. 🙂

    … or when you diff two files, they look identical, but diff reports that they differ.

    [/problem]

    [solution]

    You probably have control characters in the file, or no newline at the end of the file.

    The example shows how to run an octal dump (od) on a file.

    [/solution]

    [example]

    # octal dump of a file; -c shows characters, useful for seeing control chars, etc. and lines with no newline.

    od -c filename
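
To make that concrete, here is a small sketch that builds a file containing a tab, a carriage return and no final newline, then inspects it. The helper name ends_with_newline is made up for this example; everything else is standard od, printf and tail.

```shell
#!/bin/sh
# Build a sample file with a tab, a carriage return and no final newline.
sample=$(mktemp)
printf 'one\ttwo\r\nlast line' > "$sample"

# od -c prints control characters as escapes such as \t, \r and \n.
od -c "$sample"

# ends_with_newline FILE (hypothetical helper): succeeds if the last byte is a newline.
# Command substitution strips a trailing newline, so an empty result means
# the last byte was a newline (an empty file also gives an empty result).
ends_with_newline() {
    [ -z "$(tail -c 1 "$1")" ]
}

if ends_with_newline "$sample"; then
    echo "final newline present"
else
    echo "no newline at end of file"
fi
```

A stray \r at the end of each line in the od output usually means the file has DOS line endings, which is a common reason for two files looking identical while diff says otherwise.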

    [/example]

    [reference]

    [tags]Octal dump, od, Unix Coding School[/tags]

    [/reference]

    Find secrets – undocumented feature

    [problem]

    You want to find a pattern in a set of files, and have both the file name and the matching line printed.

    If you run find . -type f -exec grep -i "pattern" {} \; it will only show the matching lines, i.e. not the file name as well.

    If you run find . -type f -exec grep -il "pattern" {} \; it will only show the file names, i.e. not the lines containing the pattern as well! 🙂

    [/problem]

    [solution]

    Maybe you have seen this solution before, but AFAIK it is completely undocumented.

    I discovered it whilst trying to do this very thing and I just thought, hmmm … I wonder. 🙂

    [/solution]

    [example]

    To force grep to print the line containing the pattern and the file name, we pass /dev/null as another argument. grep now sees more than one file argument, so it prefixes each matching line with the name of the file it came from.

    • The normal basic find

    • $ find . -type f -exec grep -i author {} \;

      <ADDRESS CLASS="doc-author">The OpenLDAP Project <<A target=_blank HREF="http://www.openldap.org/">http://www.openldap.org/</A>></ADDRESS>
      <META name="Author" content="TechRock">
      <META HTTP-EQUIV="AUTHOR" content="TechRock">

    • find looking for just the file name

    • $ find . -type f -exec grep -il author {} \;

      ./cs/ldap-head.inc
      ./meta.inc

    • Undocumented feature, print file name and line content

    • $ find . -type f -exec grep -i author {} /dev/null \;

      ./cs/ldap-head.inc:<ADDRESS CLASS="doc-author">The OpenLDAP Project <<A target=_blank HREF="http://www.openldap.org/">http://www.openldap.org/</A>></ADDRESS>
      ./meta.inc:<META name="Author" content="TechRock">
      ./meta.inc:<META HTTP-EQUIV="AUTHOR" content="TechRock">
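
Here is a self-contained sketch of the same trick on a scratch directory (the file names and contents are invented for the demo). It also ends -exec with + rather than \;, so find hands many files to each grep run instead of starting one grep per file. GNU and BSD grep additionally offer -H to force the file name, but that option is not in POSIX, so the /dev/null approach is the portable one.

```shell
#!/bin/sh
# Scratch tree with one matching file and one non-matching file.
tree=$(mktemp -d)
mkdir "$tree/cs"
printf '<META name="Author" content="TechRock">\n' > "$tree/meta.inc"
printf 'nothing to see\n' > "$tree/cs/other.inc"

cd "$tree" || exit 1

# /dev/null is an extra, empty file argument, so grep always sees more than
# one file and prefixes each match with the name of the file it came from.
find . -type f -exec grep -i author /dev/null {} +
```

With -exec ... {} + the braces must be the last argument before the +, which is why /dev/null moves in front of them here.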

    [/example]

    [reference]

    [tags]unix find, find undocumented features, find secrets, Unix Coding School[/tags]

    [/reference]

    Synchronize directories between firewalled hosts

    [problem]

    You have an admin host (ADMIN), which can see two servers (hostA and hostB) behind firewalls.

    You are not allowed to scp/rdist, etc from hostA or B to ADMIN. Nor are you allowed to scp/rdist, etc between hostA and B.

    ADMIN can scp, etc though to both hosts. Now in this real life example, updates are made directly to hostA – but not hostB. So they need to be synchronized somehow, looking at the real files and performing a pull to ADMIN from hostA and then a push to hostB.

    But then we only want new files …

    [/problem]

    [solution]

    In the script in the example, I pull from hostA and push to hostB, but only look for files that have been updated since the last time I looked. 🙂

    [/solution]

    [example]

    It starts off by setting variables and creating a temporary directory on the ADMIN host.

    Then at line 26 I obtain a list of files from hostA that have been updated since the last time I ran this script.

    At line 29 I spin around this list and copy the given file to the temp directory (line 34). Notice I use ${file##*/}, which strips off the path, so I just get the file name (i.e. bob.gif not /dirABCD/…./bob.gif).

    I then copy this file (line 36) back out to the full path on hostB, then build a test and check the file exists on hostB (lines 38 – 42). Then I remove the temporary copy of the file (line 44).

    Spin back round for the next file at line 45. Lastly, clean up.

    Whilst this is not as efficient as rdist (opening one ssh connection), it will only copy updated files, so there is no tar of the whole directory, etc. Plus, by alternating between hostA and hostB, there is a few seconds’ delay between each file update, so it shouldn’t overload either host.


    1:#!/bin/zsh
    2:
    3:tempdir="/tmp/sync.$$"
    4:filelist="files.lst"
    5:basedir="/doc_root"
    6:touchfile="/tmp/.sync"
    7:dtg=$(perl -M'English' -e 'print $BASETIME."\n";')
    8:report="/tmp/sync.$dtg"
    9:
    10:
    11:[ -d $tempdir ] && {
    12: echo "$0: $tempdir already exists - exiting ..."
    13: exit 1
    14:}
    15:
    16:mkdir $tempdir || {
    17: echo "$0: mkdir $tempdir failed - exiting ..."
    18: exit 1
    19:}
    20:
    21:cd $tempdir || {
    22: echo "$0: $tempdir not accessible - exiting ..."
    23: exit 1
    24:}
    25:
    26:/bin/ssh hostA /bin/find $basedir -type f -a -newer $touchfile > $filelist 2> /dev/null
    27:#/bin/ssh hostA /bin/find $basedir -type f -a -mtime -1 > $filelist 2> /dev/null
    28:
    29:for file in $(<$filelist)
    30:do
    31: echo "copying $file"
    32: echo "copying $file" >> $report
    33:
    34: /bin/scp hostA:$file . 2> /dev/null
    35:
    36: /bin/scp ${file##*/} hostB:$file 2> /dev/null
    37:
    38: check=$(/bin/ssh hostB /bin/ls -ld "${file}" 2>&1)
    39:
    40: [[ $(echo $check | grep -c "No such") -gt 0 ]] && {
    41: echo "$0:$dtg:$file failed" >> $report
    42: }
    43:
    44: /bin/rm ${file##*/}
    45:done
    46:
    47:/bin/ssh hostA /bin/touch $touchfile 2> /dev/null
    48:
    49:/bin/rm $filelist
    50:
    51:cd /tmp
    52:
    53:rmdir $tempdir
    54:
    55:exit 0

    See reference to download this code.
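
The two building blocks of the script, the -newer marker file and the ${file##*/} expansion, can be tried locally without any remote hosts. This is a minimal sketch under that assumption: the temporary directory stands in for the doc root on hostA, and the file names and timestamps are invented.

```shell
#!/bin/sh
# Local stand-in for the doc root on hostA (hypothetical paths).
basedir=$(mktemp -d)
touchfile="$basedir/.sync"
mkdir -p "$basedir/img"

# An old file, then the marker from the "previous run", then a newer file.
touch -t 202001010000 "$basedir/img/old.gif"
touch -t 202001020000 "$touchfile"
touch -t 202001030000 "$basedir/img/bob.gif"

# Only files modified after the marker are listed.
find "$basedir" -type f -newer "$touchfile" | while IFS= read -r file
do
    # ${file##*/} strips the longest prefix ending in "/", leaving the file name.
    # ${file%/*} strips the shortest suffix starting at "/", leaving the directory.
    echo "name=${file##*/} dir=${file%/*}"
done
```

Reading the list with while read, rather than for file in $(<list), keeps file names containing spaces in one piece.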

    [/example]

    [reference]

    [tags]synchronized files, pull and push files, unix scp, scp, UNIX Coding School[/tags]

    [/reference]

    UNIX chat shell code

    [problem]

    As most UNIX bods know, with fairly recent audit lock downs – talkd has been disabled.

    But you want to chat remotely with someone, i.e. another techie bod! 🙂

    [/problem]

    [solution]

    Now where is the fun in that! 🙂

    Don’t know about you, but quite often I’ve needed to talk to other people remotely and don’t want to hold a phone or use hands free.

    See the example.

    [/solution]

    [example]

    Here is a quick and dirty solution, that works with bash – should work with zsh too.

    I welcome any enhancements or suggestions, via the comments.


    #!/bin/bash

    trap 'kill $tailpid;echo "*$nick leaves chat*" >> ${chatdir}/discussion;
    rm ${chatdir}/.${nick};exit 0' 2

    chatdir="/var/tmp/unixchat"

    [ -d $chatdir ] || {
    mkdir $chatdir || { echo "failed to mkdir $chatdir"; exit 1; }
    }

    chmod 777 /var/tmp/unixchat

    while [ $(echo X${nick} | grep -c "^X$") -eq 1 ]
    do
    echo -n "Nick: "
    read nick

    [ -f "${chatdir}/.${nick}" ] && {
    echo "$nick already exists, try another";nick=""
    }

    done

    touch ${chatdir}/.${nick}

    [ ! -f ${chatdir}/discussion ] && {
    touch ${chatdir}/discussion
    chmod 666 ${chatdir}/discussion
    }

    echo "*$nick joins chat*" >> ${chatdir}/discussion

    tail -1000f ${chatdir}/discussion&
    tailpid=$!

    echo ">> type quit to exit <<"

    while [ $(echo X${entry} | grep -c "^Xquit$") -ne 1 ]
    do
    read entry
    echo "[${nick}@`date '+%d-%b %H:%M:%S'`] says: $entry" >> ${chatdir}/discussion&

    done

    kill $tailpid; rm ${chatdir}/.${nick}

    echo "*$nick leaves chat*" >> ${chatdir}/discussion

    exit 0

    (Screenshot: UNIX chat code, a simple solution written in shell)
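
One possible enhancement, offered as a sketch rather than a drop-in replacement: put the leave message and the nick file removal into a single cleanup function and hang it on an EXIT trap, so that typing quit, pressing Ctrl-C and being killed all share one code path. The scratch chatdir and the nick demo are assumptions made so the snippet can run on its own.

```shell
#!/bin/bash
# Scratch chat directory for the demo (the script above uses /var/tmp/unixchat).
chatdir=$(mktemp -d)
nick="demo"

# Announce departure and remove the nick marker; safe to call more than once.
cleanup() {
    [ -f "$chatdir/.$nick" ] || return 0
    rm -f "$chatdir/.$nick"
    echo "*$nick leaves chat*" >> "$chatdir/discussion"
}
# EXIT covers a normal exit; INT and TERM call exit, which then runs the EXIT trap.
trap cleanup EXIT
trap 'exit 0' INT TERM

touch "$chatdir/.$nick"
echo "*$nick joins chat*" >> "$chatdir/discussion"
echo "[$nick@$(date '+%d-%b %H:%M:%S')] says: hello" >> "$chatdir/discussion"

# Simulate leaving, then show the log.
cleanup
cat "$chatdir/discussion"
```

In the full script the cleanup function would also kill the background tail before removing the nick file.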

    [/example]

    [reference]

    [tags]UNIX, Chat code, Shell Script[/tags]

    [/reference]

    extract specific line number from file using sed

    [problem]

    You want to extract a specific line number from a file.

    Or a range of numbers from a file.

    [/problem]

    [solution]

    Once you have your line number (grep -n will give you one), you can extract the lines around the pattern.

    To do this, sed (the stream editor) can be used to print just the desired lines; see the example.

    [/solution]

    [example]

    This says: don’t print every line by default (-n); print (p) from line 456 to line 466:


    sed -n 456,466p filename

    Also with sed, we can say delete specific lines – in this case remove lines 5 to 10:


    sed 5,10d filename

    That’s not all: sed can also accept patterns as start/end identifiers (quote the expression so the shell leaves it alone):


    sed '/start_pattern/,/end_pattern/d' filename
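
The three forms can be tried on a small generated file. A minimal sketch, using an invented 20-line sample so the line numbers are easy to check by eye; it also shows a variant that quits straight after printing a single line, which saves sed reading the rest of a large file.

```shell
#!/bin/sh
# Sample file with 20 numbered lines.
sample=$(mktemp)
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$sample"

# Print lines 5 to 7 only.
sed -n '5,7p' "$sample"

# Print line 12, then quit so sed does not read the rest of the file.
sed -n '12{p;q;}' "$sample"

# Delete from the first line matching "line 3" through the next line matching "line 18".
sed '/^line 3$/,/^line 18$/d' "$sample"
```

The last command leaves only lines 1, 2, 19 and 20.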

    [/example]

    [reference]

    [tags]UNIX sed, UNIX, sed, UNIX Coding School[/tags]

    [/reference]

    Debugging Sendmail

    [problem]

    You have issues with sendmail, such as it tacking on another domain name.

    You want to manually test your mail host from the command line.

    [/problem]

    [solution]

    Whilst debugging sendmail issues, I came across this great test.

    You can run this from the command line. See example.

    [/solution]

    [example]


    telnet mailrelay.domain 25

    .... answer from mail relay .....

    helo unix hostname
    mail from: root@unix hostname
    rcpt to: email test@something out
    data

    mail test from unix
    .
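
If you run this test often, the dialogue can be generated by a small function. This is a sketch only: the function name smtp_dialogue and the example.com addresses are placeholders, and piping the output into nc (netcat) is an assumption that depends on nc being installed and on the relay tolerating commands sent without waiting for each reply, which stricter servers reject.

```shell
#!/bin/sh
# smtp_dialogue HELO_NAME FROM TO (hypothetical helper): print the SMTP commands
# from the session above, each terminated with CRLF as the protocol expects.
smtp_dialogue() {
    printf 'HELO %s\r\n' "$1"
    printf 'MAIL FROM:<%s>\r\n' "$2"
    printf 'RCPT TO:<%s>\r\n' "$3"
    printf 'DATA\r\n'
    printf 'Subject: mail test from unix\r\n\r\nmail test from unix\r\n.\r\n'
    printf 'QUIT\r\n'
}

# Show the dialogue; cat -v makes the carriage returns visible as ^M.
smtp_dialogue unixhost.example.com root@unixhost.example.com test@example.com | cat -v

# To send it for real (assumption: nc is available and the relay allows it):
#   smtp_dialogue unixhost.example.com root@unixhost.example.com test@example.com | nc mailrelay.domain 25
```

For step-by-step debugging the interactive telnet session is still the better tool, since you see the relay’s reply to each command before sending the next.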

    [/example]

    [reference]

    [tags]sendmail debugging, debugging sendmail, sendmail, debugging, UNIX Coding School[/tags]

    [/reference]

    unix sort

    [problem]

    You want to perform a sort:

    … But ignore leading blanks;
    … Perform numerical sort;
    … Offset it to the third column;
    … Reverse it;

    [/problem]

    [solution]

    As always with UNIX, we supply the options as single-letter flags.

    See the example tag.

    [/solution]

    [example]

    8 flags to the UNIX sort command.

    1) -b # ignores leading blanks.

    2) -f # ignore case

    3) -n # numerical

    4) -r # reverse

    5) --key=# (or -k #) # sort by field number #; older sorts such as Solaris use +# (counted from zero). Take note of -t, which sets the field separator.

    6) -o # output to file; useful for doing a sort -o fileA fileA, rather than a redirect (sort fileA > fileA), which would empty fileA before sort reads it.

    7) -u # keep only unique lines (duplicates are dropped). You can also use the uniq command, which includes a -c option to count repeating lines.

    8) -i # ignore non-printable chars
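
Putting the flags from the problem together: ignore leading blanks, sort numerically, on the third column, in reverse. A minimal sketch on invented sample data; the second command shows -t with a colon-separated line format.

```shell
#!/bin/sh
# Sample data: name, code, count (whitespace separated, some leading blanks).
sample=$(mktemp)
printf '   alice x 10\nbob y 2\n  carol z 33\n' > "$sample"

# -b ignore leading blanks, -n numeric, -r reverse, -k 3,3 use only the third field as key.
sort -b -n -r -k 3,3 "$sample"

# With an explicit field separator: sort colon-separated lines numerically on field 3.
printf 'a:b:5\nc:d:1\n' | sort -t : -k 3,3n
```

Writing the key as 3,3 rather than just 3 stops the key at the end of the third field; a bare -k 3 would run from field 3 to the end of the line.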

    [/example]

    [reference]

    [tags]UNIX Sort, UNIX, sort, UNIX Coding School[/tags]

    [/reference]

    LDAP to SQL Perl code

    [problem]

    Whilst working on the automatic production of web statistics – came across the following problem:

    “How do I get relational data from a hierarchical structure?”

    [/problem]

    [solution]

    It didn’t take long to realize I’d have to use PHP to talk to LDAP, pull off records and upload them into a series of tables, using the cn as primary key, which could then be queried relationally. Pulling off large queries and repeatedly traversing LDAP trees is pretty slow, so I built my LDAP to SQL engine by flattening DNs into table names. I then used PHP scripts to query and produce daily snapshots.

    This is the Perl port of the PHP version. Requires some setting up on db side, but invaluable once implemented.

    Perl LDAP to SQL – Traverses LDAP trees and spits out SQL:
    Now on github

    Please leave a comment if you want help with this.

    [/solution]