[Zope] parsing a textfile line by line -- Forgive the long message

Farrell, Troy troy.farrell@wilcom.com
Tue, 20 Feb 2001 09:30:58 -0600


I am writing a log file parser to do (buzzword alert) "Streaming Media
Metrics".  I have many logfiles from streaming video/audio servers.  My
Streaming Provider makes the logs available on an ftp server.  I import the
logfiles (by hand for now, soon by Xron), and parse them with this Python
script (not an external method):

"""
  This is a set of Python functions that parse
  and report the information contained in
  NetShow Server log files.
"""
# We begin to sort the lines by spaces.
# Unlike Real Media Servers, NetShow Server log
# files are entirely separated by spaces.  This
# makes the code really easy.

import string

# number of bad lines in the log file
badline = 0

for line in string.split(logfile, "\n"):
  # process each line of the log

  e = string.split(line)
  # e is a list of each element, split by the spaces.
  loe = len(e)
  # crunch the number of elements in the list

  if (loe == 44):

    # see if for some weird reason, the line is a comment line:
    if e[0][0] == '#':
      pass
    else:
      c_ip      = e[0]
      date      = e[1]
      time      = e[2]
      c_dns     = e[3]
      cs_uri_stem = e[4]
      c_starttime = e[5]
      x_duration  = e[6]
      c_rate      = e[7]
      c_status    = e[8]
      c_playerid  = e[9]
      c_playerversion = e[10]
      c_playerlanguage = e[11]
      cs_user_agent = e[12]
      cs_referer = e[13]
      c_hostexe   = e[14]
      c_hostexever = e[15]
      c_os         = e[16]
      c_osversion  = e[17]
      c_cpu        = e[18]
      filelength = e[19]
      filesize   = e[20]
      avgbandwidth = e[21]
      protocol   = e[22]
      transport  = e[23]
      audiocodec = e[24]
      videocodec = e[25]
      channel_url = e[26]
      sc_bytes   = e[27]
      c_bytes    = e[28]
      s_pkts_sent = e[29]
      c_pkts_received = e[30]
      c_pkts_lost_client = e[31]
      c_pkts_lost_net = e[32]
      c_pkts_lost_cont_net = e[33]
      c_resendreqs = e[34]
      c_pkts_recovered_ecc = e[35]
      c_pkts_recovered_resent = e[36]
      c_buffercount = e[37]
      c_totalbuffertime = e[38]
      c_quality = e[39]
      s_ip = e[40]
      s_dns = e[41]
      s_totalclients = e[42]
      s_cpu_util = e[43]
      #cs_uri_query = e[44]

      #insert all that junk into PostgreSQL
      #context.sqlStreamInsertIntoNetShowRaw(
      #  c_ip = c_ip, date = date, time = time, c_dns = c_dns,
      #  cs_uri_stem = cs_uri_stem, c_starttime = c_starttime,
      #  x_duration = x_duration, c_rate = c_rate, c_status = c_status,
      #  c_playerid = c_playerid, c_playerversion = c_playerversion,
      #  c_playerlanguage = c_playerlanguage, cs_user_agent = cs_user_agent,
      #  cs_referer = cs_referer, c_hostexe = c_hostexe,
      #  c_hostexever = c_hostexever, c_os = c_os, c_osversion = c_osversion,
      #  c_cpu = c_cpu, filelength = filelength, filesize = filesize,
      #  avgbandwidth = avgbandwidth, protocol = protocol,
      #  transport = transport, audiocodec = audiocodec,
      #  videocodec = videocodec, channel_url = channel_url,
      #  sc_bytes = sc_bytes, c_bytes = c_bytes, s_pkts_sent = s_pkts_sent,
      #  c_pkts_received = c_pkts_received,
      #  c_pkts_lost_client = c_pkts_lost_client,
      #  c_pkts_lost_net = c_pkts_lost_net,
      #  c_pkts_lost_cont_net = c_pkts_lost_cont_net,
      #  c_resendreqs = c_resendreqs,
      #  c_pkts_recovered_ecc = c_pkts_recovered_ecc,
      #  c_pkts_recovered_resent = c_pkts_recovered_resent,
      #  c_buffercount = c_buffercount,
      #  c_totalbuffertime = c_totalbuffertime, c_quality = c_quality,
      #  s_ip = s_ip, s_dns = s_dns, s_totalclients = s_totalclients,
      #  s_cpu_util = s_cpu_util)

      #
      # Debugging print statements.  Ughh.  That is a bunch of print
      # statements.
      #

      #print "c_ip %s\n" % c_ip
      #print "date %s\n" % date
      #print "time %s\n" % time
      #print "c_dns %s\n" % c_dns
      #print "cs_uri_stem %s\n" % cs_uri_stem
      #print "c_starttime %s\n" % c_starttime
      #print "x_duration %s\n" % x_duration
      #print "c_rate %s\n" % c_rate
      #print "c_status %s\n" % c_status
      #print "c_playerid %s\n" % c_playerid
      #print "c_playerversion %s\n" % c_playerversion
      #print "c_playerlanguage %s\n" % c_playerlanguage
      #print "cs_user_agent %s\n" % cs_user_agent
      #print "cs_referer %s\n" % cs_referer
      #print "c_hostexe %s\n" % c_hostexe
      #print "c_hostexever %s\n" % c_hostexever
      #print "c_os %s\n" % c_os
      #print "c_osversion %s\n" % c_osversion
      #print "c_cpu %s\n" % c_cpu
      #print "filelength %s\n" % filelength
      #print "filesize %s\n" % filesize
      #print "avgbandwidth %s\n" % avgbandwidth
      #print "protocol %s\n" % protocol
      #print "transport %s\n" % transport
      #print "audiocodec %s\n" % audiocodec
      #print "videocodec %s\n" % videocodec
      #print "channel_url %s\n" % channel_url
      #print "sc_bytes %s\n" % sc_bytes
      #print "c_bytes %s\n" % c_bytes
      #print "s_pkts_sent %s\n" % s_pkts_sent
      #print "c_pkts_received %s\n" % c_pkts_received
      #print "c_pkts_lost_client %s\n" % c_pkts_lost_client
      #print "c_pkts_lost_net %s\n" % c_pkts_lost_net
      #print "c_pkts_lost_cont_net %s\n" % c_pkts_lost_cont_net
      #print "c_resendreqs %s\n" % c_resendreqs
      #print "c_pkts_recovered_ecc %s\n" % c_pkts_recovered_ecc
      #print "c_pkts_recovered_resent %s\n" % c_pkts_recovered_resent
      #print "c_buffercount %s\n" % c_buffercount
      #print "c_totalbuffertime %s\n" % c_totalbuffertime
      #print "c_quality %s\n" % c_quality
      #print "s_ip %s\n" % s_ip
      #print "s_dns %s\n" % s_dns
      #print "s_totalclients %s\n" % s_totalclients
      #print "s_cpu_util %s\n" % s_cpu_util
      ##print "cs_uri_query %s\n" % cs_uri_query
  else:
    # loe != 44
    # we have an error
    if (loe > 0):
      # see if for some weird reason, the line is a comment line:
      if e[0][0] == '#':
        pass
      else:
        outline = ("###A faulty line of log file: " + e[0] +
                   " with %d units ###" % loe)
        print outline
    else:
      print "***   An empty line in the log file!    ***"
      print "***  Usually this is the end of the log ***"
      badline = badline + 1

# return the collected print output once every line has been processed
return printed

I call the method like this:
<dtml-var
expr="nsparse(_.str(_[name_of_zope_object_stored_in_REQUEST_variable]))">
since my script above requires a string.
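
The 44 hand-written assignments in the script above can also be driven by a
list of field names.  What follows is only a minimal sketch in present-day
Python, not the script as it runs inside Zope: FIELD_NAMES and
parse_netshow_log are hypothetical names, the column order is copied from the
assignments above, and the function returns the records instead of printing
them.

```python
# Column names in log order, copied from the assignments in the script above.
FIELD_NAMES = """c_ip date time c_dns cs_uri_stem c_starttime x_duration
c_rate c_status c_playerid c_playerversion c_playerlanguage cs_user_agent
cs_referer c_hostexe c_hostexever c_os c_osversion c_cpu filelength
filesize avgbandwidth protocol transport audiocodec videocodec channel_url
sc_bytes c_bytes s_pkts_sent c_pkts_received c_pkts_lost_client
c_pkts_lost_net c_pkts_lost_cont_net c_resendreqs c_pkts_recovered_ecc
c_pkts_recovered_resent c_buffercount c_totalbuffertime c_quality s_ip
s_dns s_totalclients s_cpu_util""".split()


def parse_netshow_log(logfile):
    """Return (records, bad_lines) for the text of one log file.

    records is a list of dicts keyed by field name; bad_lines counts
    non-comment lines that do not have exactly one value per field.
    """
    records = []
    bad_lines = 0
    for line in logfile.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment line
        if len(fields) != len(FIELD_NAMES):
            bad_lines += 1
            continue
        records.append(dict(zip(FIELD_NAMES, fields)))
    return records, bad_lines
```

Each record is a dictionary keyed by column name, so it could be handed to a
ZSQL method such as the sqlStreamInsertIntoNetShowRaw call above as keyword
arguments, assuming the Python in use supports the **record call syntax.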

Hope that helps.
Troy

-----Original Message-----
From: Thomas Mühlens [mailto:tomeins@yahoo.com]
Sent: Tuesday, February 20, 2001 9:06 AM
To: Erik Enge; Joh Johannsen
Cc: Thomas Mühlens; zope@zope.org
Subject: Re: [Zope] parsing a textfile line by line


The idea is simple:  How can I access a textfile
(which is a zope object  n o t  a file stored in some
"var" folder on your hard disk) and interpret the data
line by line (preferably not using external methods
but using python scripts which are zope objects, too)?
I don't see any potential security violations (zope's
paranoid sometimes) so it should be possible.

This seems to be a difficult task in zope because,
like Joh said, very simple processes have to be
externalized (e.g. external methods).  Well, once I
start using external processes my code gets
fragmented and scattered over scripts in various
external folders.  My vision, and that's why I'm using
zope and not php, is to keep my code in one place (the
zope object database file data.fs) and object oriented
(it's a bummer to always change code in various
places).

Then again, it might just be my poor zope zen :) or
thin documentation of zope to solve such simple
problems.
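
For the line-by-line part of the question, here is a minimal sketch in
present-day Python.  It assumes the object's contents are already available
as a single string (how to get that string depends on the object type and is
not shown), and iter_data_lines is a hypothetical name.

```python
def iter_data_lines(text):
    """Yield (line_number, line) for each non-blank line of text.

    splitlines() copes with both "\n" and "\r\n" line endings, and the
    line numbers count blank lines too, so they match the original text.
    """
    for number, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if stripped:
            yield number, stripped
```

A caller would loop over iter_data_lines(text) and interpret each line as
needed, for example by splitting it on whitespace.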

--- Erik Enge <erik+list@esol.no> wrote:
> [Joh Johannsen]
>
> | Reading a textfile that is a zope object sounds like a good start.
> | Anyone know how to do that?
>
> I'm not quite sure if I understand.  Would it satisfy you to have a
> Zope object with a text property, and read that?  And what do you mean
> by «from Python»?  As in Python Products, Python Scripts or Python
> Extensions?

