Apache logging of file downloads
- From: ernestm@xxxxxxxxxxxxxx
- Date: 26 Jul 2006 07:47:04 -0700
We have an FTP site with Apache on top of it so that people can choose
whether to download files via HTTP or FTP. But we have a good deal of
difficulty extracting some simple metrics from the standard Apache logs
(how many downloads there were, and what percentage complete the partial
downloads were) because of two factors: 1) download accelerators and
2) interrupted/resumed downloads. Each of these causes, instead of one
log line with the total bytes equal to the file size, a number of log
lines with smaller byte counts.
However, you can't just throw a Perl preprocessor at your Apache logs to
sum up the bytes and see whether the total is about equal to one full
download (well, we do, but it's not logically correct), because of cases
like a) a user gets 99% of the download and then it fails, or b) a user
tries to download twice, gets about half the file each time, and fails
both times.
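
To make those two cases concrete, here is a minimal Python sketch of the "sum the bytes and compare" heuristic. It is an illustration only, not the preprocessor we actually run: the function name, the 5% tolerance, and the assumption that the file is exactly 455 * 1024 * 1024 bytes are all made up for the example.

```python
# Hypothetical file size: the 455 MB file, assumed here to be 455 MiB exactly.
FILE_SIZE = 455 * 1024 * 1024


def naive_is_complete(byte_counts, file_size, tolerance_percent=5):
    """Naive heuristic: call it a full download if the summed bytes of all
    log lines come within tolerance_percent of the file size."""
    threshold = file_size * (100 - tolerance_percent) // 100
    return sum(byte_counts) >= threshold


# Case a) one request that delivered 99% of the file and then failed.
case_a = [FILE_SIZE * 99 // 100]

# Case b) two separate attempts, each delivering about half the file.
half = FILE_SIZE // 2
case_b = [half, half]

# Both are reported as complete, although neither user has the whole file.
print(naive_is_complete(case_a, FILE_SIZE))  # True
print(naive_is_complete(case_b, FILE_SIZE))  # True
```

Tightening the tolerance would catch case a) but not case b), because the byte counts alone do not say which parts of the file were sent.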
To illustrate the problem, here's the Apache log snippet for one
download of a 455 MB file performed using a popular download
accelerator (Internet Download Manager 5.03) to a load balanced Web
server cluster.
server 1:
10.0.7.200 - - [25/Jul/2006:15:49:24 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 119025777
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:49:24 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 119278348
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:10 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 1382907
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:10 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 1573404
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:11 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 424763
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:11 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 28500
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:11 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 14251
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
server 2:
10.0.7.200 - - [25/Jul/2006:15:50:09 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 116434744
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:11 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 1163264
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:11 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 394218
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
10.0.7.200 - - [25/Jul/2006:15:50:11 -0500] "GET
/evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 117269584
"http://ftp.ni.com/evaluation/labview/pc/" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1)"
So when you run all this through any existing Web analytics software,
it says "11 downloads!" or, if it's really smart, "11 partial
downloads!" when in fact it's one successful download.
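
As a rough illustration of what a per-line tool sees, the following Python sketch groups entries by client and path and totals the requests and bytes. The regular expression and the `summarize` helper are assumptions written for this example, and the sample lines are rebuilt from the byte counts above with a single placeholder timestamp rather than copied line for line.

```python
import re
from collections import defaultdict

# Matches the leading fields of an Apache combined-format log line.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def summarize(lines):
    """Count 200/206 responses and sum their bytes per (client, path)."""
    totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("status") not in ("200", "206"):
            continue
        entry = totals[(m.group("host"), m.group("path"))]
        entry["requests"] += 1
        if m.group("bytes") != "-":
            entry["bytes"] += int(m.group("bytes"))
    return dict(totals)


# Byte counts from the eleven log lines above (server 1, then server 2).
BYTE_COUNTS = [119025777, 119278348, 1382907, 1573404, 424763, 28500, 14251,
               116434744, 1163264, 394218, 117269584]

# Placeholder timestamp: the real entries span 15:49:24 to 15:50:11.
TEMPLATE = ('10.0.7.200 - - [25/Jul/2006:15:50:11 -0500] '
            '"GET /evaluation/labview/pc/labview_80.exe HTTP/1.1" 206 {} '
            '"http://ftp.ni.com/evaluation/labview/pc/" '
            '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"')

result = summarize(TEMPLATE.format(n) for n in BYTE_COUNTS)
print(result)
```

For the lines above this gives 11 requests totalling 476,989,760 bytes for the one client and file, which is roughly the 455 MB file size. As noted earlier, though, a matching total is not proof of a successful download.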
This is a big problem for software vendors, as the number of eval
downloads and the activation/registration rates are business drivers.
I'm looking for ideas on how to fix this problem, ideally in the Apache
log itself. Thoughts?
Thanks,
Ernest Mueller