Help on best way to gather/sort results [Array/Hash]?



Greetings Ruby fans,

I'm a greenhorn at this cool language, Ruby. Much to learn. Perhaps you
chaps could help me with an issue I have. I've read through a number of
the posts on sorting Arrays and Hashes, and yet I can't seem to put my
finger on the solution. I want to sort on the second column (Source).
From the information I gathered, it seemed that I need to collect my
results into a hash. Am I on the right track? Oh, let me tell you what
you're looking at here: I am scanning each mail file in our queue for
commonalities (spammers), instead of using the useless (in my opinion)
qmHandle we have for qmail. So far I've got a working prototype. If you
could help me with my sort, and if you have any other
comments/suggestions to throw my way, I'm sure I could learn a thing or
two. Being new to Ruby, there are a lot of new ideas here. Thanks, guys.

Code:
#!/usr/local/bin/ruby -w
require 'find'

@results = Array.new

# Iterate through the child directories & call the parse file method
def scan_dirs
  root = "/var/qmail/queue/mess"
  Find.find(root) do |file|
    parse_file(file)
  end
  @results.sort!
  print_results
end

# Parse each file for the information we want
def parse_file(path)
  file = path[(path.length-7), path.length]
  sourceip = ""
  email = ""
  subject = ""
  email_found = false
  line_no = 0

  File.open(path, 'r').each do |line|

    line = line.strip # Remove any \n\r nil, etc
    line_no += 1

    if line_no == 1
      if line.match("invoked for bounce")
        # Internal Bounce Msg
        sourceip = "SMTP"
      end
    end

    if (line_no == 2 and sourceip.empty?)
      if line.match("webmail.commspeed.net")
        sourceip = "Webmail"
      else
        sourceip = line.scan(/\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/)
        if sourceip.empty?
          sourceip = "No Source IP**"
        end
      end
    end

    if (line.match("SquirrelMail") and sourceip == "Webmail") or
       (line.match("From:") and sourceip != "Webmail")
      if email.empty?
        email = get_email(line)
      end
    end

    if line.match("Subject:") and subject.empty?
      subject = truncate(line,50)
    end

    if line_no == 20 #Nothing more we want to read in the file
      @results << ["#{file}", "#{sourceip}", "#{email}", "#{subject}"]
      line_no = 0
      return
    end
  end
end

# Truncate subject line
def truncate(string, width)
  if string.length <= width
    string
  else
    string[0, width-3] + "..."
  end
end

# Print out results
def print_results
  print "\e[2J\e[f"

  print "Mess#".ljust(10," ")
  print "Source".ljust(18," ")
  print "Email Addrress".ljust(30, " ")
  print "Subject".ljust(50, " ")
  1.times { print "\n" }
  111.times { print "-" }
  1.times { print "\n" }

  @results.each do |line|
    print line[0].ljust(10," ")
    print line[1].ljust(18," ")
    print line[2].ljust(30, " ")
    print line[3].ljust(50, " ")

    1.times { print "\n" }
  end
end

# Get email address from line/string
def get_email(line_to_parse)
  # Pull the email address from the line
  line_to_parse.scan(/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i).flatten
end

# Ok, begin our scan
scan_dirs
exit
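
One thing worth noting about the traversal above: Find.find yields directories (including the root itself) as well as files, and File.open without a block leaves the handle open. A minimal sketch of one way to guard against both, assuming only the first few lines of each message matter, follows. The helper name first_lines and the temporary directory layout are hypothetical and exist only to make the example runnable; they are not part of the original script.

```ruby
require 'find'
require 'tmpdir'

# Hypothetical helper: collect the first `limit` stripped lines of every
# regular file below `root`, keyed by the file's base name.
def first_lines(root, limit)
  found = {}
  Find.find(root) do |path|
    next unless File.file?(path) # Find.find also yields directories
    # File.foreach closes the file itself, even when we stop early
    found[File.basename(path)] = File.foreach(path).first(limit).map { |l| l.strip }
  end
  found
end

# Throwaway directory tree standing in for the real mail queue
Dir.mktmpdir do |dir|
  Dir.mkdir(File.join(dir, "0"))
  File.write(File.join(dir, "0", "3360108"), "line one\nline two\nline three\n")
  p first_lines(dir, 2)
end
```

Reading only the leading lines up front also means a message shorter than 20 lines would still be picked up, whereas the loop above only records a file once it reaches line 20.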

Partial results listing: (I've modified the content to protect privacy)
Mess#     Source            Email Addrress                Subject
---------------------------------------------------------------------------------------------------------------
3360108   111.111.17.1      hobby@xxxxxxxxxxxxx
3360167   111.111.7.213     hunter@xxxxxxxxxxxxx          Subject: Removed to protect the innocent....
3360186   Webmail           fisher@xxxxxxxxxxxxx          Subject: Removed to protect the innocent
3360209   111.111.40.10     curator@xxxxxxxxxxxxxxx
3360215   111.111.15.110    blueprints@xxxxxxxxxxxxx      Subject: Removed to protect the innocent
3360217   111.111.9.248     user1@xxxxxxxxxxxxx           Subject: Removed to protect the innocent
3360226   111.111.11.43     user@xxxxxxxxxxxxx            Subject: Removed to protect the innocent
3360228   111.111.16.34     user@xxxxxxxxxxxxx            Subject: Pictures
3360241   111.111.18.73     joe@xxxxxxxxxxxxx             Subject: Removed to protect the innocent
3360242   111.111.14.109    user@xxxxxxxxxxxxx            Subject: Emailing: maps.htm
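
For the sort on the second column, a hash is not strictly needed. Array#sort! compares rows element by element, so it orders by the first column (the message number), which matches the listing above. A minimal sketch using sort_by on the second column follows. It assumes every row is an array of four strings, as in @results; source_key is a hypothetical helper, and the sample rows and example.invalid addresses are made up for illustration. IP addresses are compared octet by octet as integers, and non-IP sources such as "Webmail" sort after them.

```ruby
# Hypothetical helper: build a sort key from the "Source" column.
# The key always has the same shape (rank, octets, text) so that any
# two keys can be compared without type errors.
def source_key(source)
  if source =~ /\A\d{1,3}(\.\d{1,3}){3}\z/
    [0, source.split(".").map { |octet| octet.to_i }, ""]
  else
    [1, [], source]
  end
end

# Made-up sample rows in the same shape as @results
rows = [
  ["3360108", "111.111.17.1",  "a@example.invalid", ""],
  ["3360167", "111.111.7.213", "b@example.invalid", "Subject: one"],
  ["3360186", "Webmail",       "c@example.invalid", "Subject: two"],
  ["3360226", "111.111.11.43", "d@example.invalid", "Subject: three"]
]

# Sort on the second column (index 1) only
sorted = rows.sort_by { |row| source_key(row[1]) }

sorted.each { |row| puts row[0].ljust(10) + row[1] }
```

A plain `rows.sort_by { |row| row[1] }` would also sort on that column, but as strings, so "111.111.11.43" would come before "111.111.7.213".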
--
Posted via http://www.ruby-forum.com/.
