Re: data manipulation (265 Views)
Super Advisor
lawrenzo_1
Posts: 560
Registered: ‎06-06-2003
Message 1 of 10 (265 Views)
Accepted Solution

data manipulation

Hello all,

Please could you provide some help / ideas:

I have a file that is updated every time a job starts or finishes against an Informix database. I would like to run some code against the file to determine the start and stop times; later I will extend the script to calculate the average run times.

the logfile entries are as follows:

start/end PID date time program user script option

S|1161480|20070720|1205|createWebOrdFile.4ge|cronlog|.|/usr/cs3/scripts/JDE/createWebOrdFile||
E|1161480|20070720|1205|createWebOrdFile.4ge|cronlog|.|/usr/cs3/scripts/JDE/createWebOrdFile||


What I am looking for is a script to be run against the previous day's 24-hour period that details every job that has run, with its start and stop times and the option the script was run with:

scriptname:4ge program name:start:stop:duration:option

any help will be much appreciated.

Thanks

Chris.

hello
Honored Contributor
Sandman!
Posts: 2,220
Registered: ‎01-13-2005
Message 2 of 10 (265 Views)

Re: data manipulation

Try the awk script below:

awk -F\| '{
# keep the start and stop details per PID (field 2), not just the last ones seen
if ($1=="S") {s[$2]=$8":"$5":"$3" "$4;str[$2]=$4}
else if ($1=="E") {e[$2]=$3" "$4;stp[$2]=$4}
}END{for(i in s) print s[i]":"e[i]":"stp[i]-str[i]}' file
Honored Contributor
Peter Nikitka
Posts: 1,575
Registered: ‎02-10-2003
Message 3 of 10 (265 Views)

Re: data manipulation

Hi,

Let me sort out the description of the logfile fields you gave:
1 start(S) end(E)
2 PID (1161480)
3 date (20070720)
4 time (1205) => is that HHMM ?
5 program (createWebOrdFile.4ge)
6 user (cronlog)
7 script (.)
8 option (/usr/cs3/....)
9,10 ignored

Please form your requested output format like this:
"script:"(7) "prog:"(5) ...

mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
Honored Contributor
Sandman!
Posts: 2,220
Registered: ‎01-13-2005
Message 4 of 10 (265 Views)

Re: data manipulation

Actually Peter is spot-on. The sample input provided is sketchy, as there isn't a one-to-one correspondence between the fields and their data. The data seems to be off by a few fields. The same goes for the output. Please provide a clearer example of the input and output.

~thanks
Acclaimed Contributor
James R. Ferguson
Posts: 21,184
Registered: ‎07-06-2000
Message 5 of 10 (265 Views)

Re: data manipulation

Hi Chris:

> ...previous day for a 24hr period...

Would you define what you mean, please? Will that definition include records that start on one day and end on another?

Regards!

...JRF...
Super Advisor
lawrenzo_1
Posts: 560
Registered: ‎06-06-2003
Message 6 of 10 (265 Views)

Re: data manipulation

Yes, this will include records that run from one day into the next; however, I will output to a daily file and add a condition so that if a PID is not found, the current audit log is checked.

unless there are any other suggestions?
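
One way to approach the "PID not found" condition is to list every start record that has no matching end record in the file(s) given. This is only a rough sketch: the function name unfinished_jobs and the log file names in the usage note are made up for illustration, and it assumes the field layout from this thread (1 = S/E, 2 = PID, 3 = date, 4 = time, 8 = script).

```shell
# Hypothetical helper: print "PID:date time:script" for each job that has a
# start (S) record but no end (E) record in the given log file(s).
unfinished_jobs() {
    awk -F'|' '
        $1 == "S" { start[$2] = $3 " " $4 ":" $8 }   # remember the start per PID
        $1 == "E" { delete start[$2] }               # end seen: job is complete
        END { for (pid in start) print pid ":" start[pid] }
    ' "$@" | sort
}
```

Called as "unfinished_jobs audit.yesterday" it lists the jobs still open at the end of the daily file; called as "unfinished_jobs audit.yesterday audit.current" it lists only those that have not finished in the current log either. Both file names are placeholders.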

Thanks

Chris
hello
Super Advisor
lawrenzo_1
Posts: 560
Registered: ‎06-06-2003
Message 7 of 10 (265 Views)

Re: data manipulation

guys:

1 start(S) end(E)
2 PID (1161480)
3 date (20070720)
4 time (1205) => is that HHMM ?
5 program (createWebOrdFile.4ge)
6 user (cronlog)
7 directory the script is run from (.)
8 script
9 option, i.e. <script> - T1, with T1 being the store identifier.

field 7 can be ignored

Thanks
hello
Honored Contributor
Peter Nikitka
Posts: 1,575
Registered: ‎02-10-2003
Message 8 of 10 (265 Views)

Re: data manipulation

Ok,

my solution does not use any values of the End entry except PID, date and time; other fields are not checked. Incomplete lines (missing or additional fields) are ignored.
The PID is used as a unique identifier; jobs whose start and end fall on different days are ignored (you said that this is okay!). So the runtime is based solely on 'time'. If necessary, this could be handled more gracefully.
The empty field 9 (of your example) is 'option'.
If you need any headers/prefixes, modify them in the output in 'END'.

So let's try this:

awk -F'|' 'NF!=10 {next}
{idx=$2""}
$1=="S" {day[idx]=$3;stim[idx]=$4; scr[idx]=$8; prog[idx]=$5; opt[idx]=$9}
$1=="E" {if($3!=day[idx]) next
if(idx in stim) etim[idx]=$4 }
END {for (r in etim) printf("script:%s prog:%s rt:%d-%d=%d opt=%s\n", scr[r],prog[r], etim[r], stim[r],etim[r]-stim[r], opt[r])}' YOURFILE

mfG Peter
Honored Contributor
Hein van den Heuvel
Posts: 6,588
Registered: ‎05-19-2003
Message 9 of 10 (265 Views)

Re: data manipulation

Hmmm,

The prior solutions seem to subtract two HHMM values to get a duration.
That will result in 41 minutes going from 1259 to 1300 instead of the 1 minute one might reasonably expect.
Also, the prior solutions specifically remember stuff from the start record which is also available in the end record.

I would suggest:

$ awk -f test.awk test.txt

where
----------- test.awk ----------
BEGIN { FS="|"; OFS=":" }
function minutes(time,   h, m) { h = int(time/100); m = time - h*100; return h*60 + m }   # h, m are locals
/^S/ {s[$2]=$4}
/^E/ {print $8 " " $5,s[$2],$4,minutes($4)-minutes(s[$2])}

Sample data
---------------- test.txt -----------
S|1161480|20070720|1259|createWebOrdFile.4ge|cronlog|.|test||
E|1161480|20070720|1300|createWebOrdFile.4ge|cronlog|.|test||
S|1161483|20070720|1205|createWebOrdFile.4ge|cronlog|.|xxxxxxxxxxxxx||
S|1161484|20070720|1205|createWebOrdFile.4ge|cronlog|.|yyyyyyyyyyyyy||
E|1161484|20070720|1255|createWebOrdFile.4ge|cronlog|.|aaaaaaaaaaaaa||
E|1161483|20070720|1310|createWebOrdFile.4ge|cronlog|.|bbbbbbbbbbbbb||


Results in:
--------------------------
test createWebOrdFile.4ge:1259:1300:1
aaaaaaaaaaaaa createWebOrdFile.4ge:1205:1255:50
bbbbbbbbbbbbb createWebOrdFile.4ge:1205:1310:65
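
The durations above still assume that a job starts and ends on the same day. For jobs that run past midnight, one possible extension is to fold the date (field 3) into the calculation. The sketch below is an illustration under assumptions, not a tested production script: job_durations and abs_minutes are made-up names, the day count uses the standard Julian Day Number formula so that plain POSIX awk is enough, and the output follows the scriptname:program:start:stop:duration:option layout asked for in the first post, taking field 8 as the script and field 9 as the option.

```shell
# Sketch only: job_durations is a hypothetical name, not from the thread.
# Prints script:program:start:stop:duration:option, duration in minutes.
job_durations() {
    awk -F'|' -v OFS=':' '
        # Minutes since a fixed epoch for a YYYYMMDD date and an HHMM time,
        # using the Julian Day Number formula for Gregorian dates.
        function abs_minutes(date, time,   y, m, d, a, jdn) {
            y = int(date / 10000); m = int(date / 100) % 100; d = date % 100
            a = int((14 - m) / 12); y = y + 4800 - a; m = m + 12 * a - 3
            jdn = d + int((153 * m + 2) / 5) + 365 * y
            jdn = jdn + int(y / 4) - int(y / 100) + int(y / 400) - 32045
            return jdn * 1440 + int(time / 100) * 60 + time % 100
        }
        $1 == "S" { sdate[$2] = $3; stime[$2] = $4 }      # start, keyed by PID
        $1 == "E" && ($2 in sdate) {                      # end with a known start
            dur = abs_minutes($3, $4) - abs_minutes(sdate[$2], stime[$2])
            print $8, $5, sdate[$2] " " stime[$2], $3 " " $4, dur, $9
            delete sdate[$2]; delete stime[$2]            # PID may be reused later
        }
    ' "$@"
}
```

With this, a job logged as starting at 20070720 2350 and ending at 20070721 0010 is reported with a duration of 20 minutes. End records without a matching start are skipped silently.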


Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting
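
For the average run times mentioned in the original question, a small follow-up step could read report lines shaped like the results above (colon-separated, duration in minutes as the last field) and average per job. This is a minimal sketch; the function name average_runtimes is hypothetical, and it assumes the first colon-separated field ("script program") identifies the job.

```shell
# Hypothetical helper: average the last field (minutes) per first field.
average_runtimes() {
    awk -F':' '
        { sum[$1] += $NF; n[$1]++ }                      # total and count per job
        END { for (k in sum) printf "%s:%.1f\n", k, sum[k] / n[k] }
    ' "$@" | sort
}
```

Something like "awk -f test.awk test.txt | average_runtimes" would then print one line per job with its mean duration in minutes.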
Super Advisor
lawrenzo_1
Posts: 560
Registered: ‎06-06-2003
Message 10 of 10 (265 Views)

Re: data manipulation

thanks guys,

some good solutions here which I will use.

Chris
hello