[NMLUG] Python tight loop causing massive CPU barfage



I used the following Python code to open 175,000 mail 
files, read the leading lines of each file (about 1024 
bytes' worth, since the argument to readlines() is a size 
hint in bytes, not a line count) and test each line for a 
leading "Subject".

It took 2 minutes 24 seconds on an 800MHz Pentium III 
running SuSE 9.0:

real    2m24.097s
user    2m10.270s
sys     0m9.530s


Here's the read/search loop for each file:

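     import os
     import re
     import stat

     myre = re.compile('^Subject')

     # filelist is the list of mail file paths, built elsewhere
     for fn in filelist:

        mode = os.stat(fn)[stat.ST_MODE]

        # skip directories and other non-regular files
        if stat.S_ISREG(mode):

            mf_fd = os.open( fn, os.O_RDONLY )
            the_file = os.fdopen(mf_fd, "r", 1024)
            # readlines(1024): whole lines totalling roughly 1024 bytes
            the_lines = the_file.readlines(1024)
            for line in the_lines:
                # result is discarded; this run only times the read and match
                m = myre.match(line)
            the_file.close()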
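
A possible leaner variant of the loop above, offered only as a sketch 
that I have not timed: read one fixed 1024-byte chunk in binary mode and 
test each line with startswith() instead of a regex. The names 
find_subject and HEADER_BYTES are my own inventions for illustration, 
and the 1024-byte limit is assumed to be enough to reach the Subject 
header.

```python
import os
import stat

HEADER_BYTES = 1024  # assumed limit, matching the 1024 used in the loop above


def find_subject(path, limit=HEADER_BYTES):
    """Return the first line starting with b"Subject" found within the
    first `limit` bytes of a regular file, or None if there is none."""
    try:
        st = os.stat(path)
    except OSError:
        return None  # file vanished or cannot be stat'ed
    if not stat.S_ISREG(st.st_mode):
        return None  # directory, socket, etc.
    with open(path, "rb") as fh:
        chunk = fh.read(limit)  # one read call, at most `limit` bytes
    for line in chunk.splitlines():
        if line.startswith(b"Subject"):
            return line
    return None
```

Calling find_subject(fn) for each fn in filelist would replace the body 
of the loop. Binary mode sidesteps any decoding trouble with odd bytes 
in mail files, at the cost of getting bytes back rather than text.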



On Tue, 08 Feb 2005 20:44:17 -0700
  Paul Tietjens <paul.tietjens@moriarty.k12.nm.us> wrote:
> I have a python script that essentially opens a few thousand
> (between 70,000 and 230,000 or so) files, reads the first 1024
> bytes and looks for a string match.
>
> The goal is to search an entire partition full of Maildirs for
> specific emails.
>
> I want the process to happen as fast as possible.  So far, it
> takes around 21 minutes - but there's a snag.  While this script
> is running, every other process on the machine becomes sluggish
> to the point of nonresponsiveness.
>
> No amount of playing with nice and priority levels seems to help.
>
> What has helped, is a small sleep() in the loop - but that raises
> the amount of time taken to complete the tasks fairly rapidly
> (from 21 minutes to over an hour).
>
> In the end, I set up a goofy sort of throttling that alters the
> amount of time sleep()ing by the average load.
>
> Is there a better way to do this?  I'm not much of a coder, and I
> know there are a couple on this list - so any tips offered, no
> matter how nebulous, would be great.
>
> Thanks in advance!
> _______________________________________________
> NMLUG mailing list
> NMLUG@nmlug.org
> http://www.nmlug.org/mailman/listinfo/nmlug
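
For reference, the load-based throttling described in the quoted message 
might look something like the sketch below. This is a guess at the 
general idea, not the actual script: throttle_delay, throttled, and the 
target, step, max_delay and check_every values are all made up for 
illustration. os.getloadavg() is only available on Unix-like systems.

```python
import os
import time


def throttle_delay(load, target=1.0, step=0.05, max_delay=0.5):
    """Seconds to sleep for a given 1-minute load average.

    target, step and max_delay are hypothetical tuning knobs.
    """
    excess = load - target
    if excess <= 0:
        return 0.0  # machine is not busy, do not sleep
    return min(max_delay, excess * step)


def throttled(items, check_every=100):
    """Yield items, pausing after every `check_every` items when the
    1-minute load average is above the target."""
    for count, item in enumerate(items, 1):
        yield item
        if count % check_every == 0:
            delay = throttle_delay(os.getloadavg()[0])
            if delay:
                time.sleep(delay)
```

Used as "for fn in throttled(filelist):", this checks the load once per 
batch of files rather than sleeping on every iteration, which keeps the 
added delay small when the machine is otherwise idle.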


