Re: More on BEGINFILE / ENDFILE
- From: arnold@xxxxxxxxxx (Aharon Robbins)
- Date: Tue, 3 Feb 2009 19:12:23 +0000 (UTC)
In article <glvuuu$ghk$1@xxxxxxxxxxxxxxxxxx>,
Manuel Collado <m.collado@xxxxxxxxxxxxxx> wrote:
It's not clear what you're asking. A read error has two possible outcomes
in gawk:
1. If reading via getline from a command-line file, an error is returned.
2. If reading via the main input loop, the error is fatal.
You seem to be suggesting that in case 2, the error not be fatal, but
instead go into the ENDFILE block with ERRNO set.
Yes. This is exactly what I was suggesting.
OK. The diff below should do this. It is relative to the BEGINFILE
patch. I will be updating that patch on http://www.skeeve.com shortly.
Anyway, in practice, it is hard to have a case where a file is readable
part way through the processing and then suddenly becomes unreadable.
Well, if AWK gained true Unicode support sometime in the future, we could
have errors in the middle of a UTF-8 file, like "invalid byte sequence".
I don't even want to think about this. This is the job iconv is meant to do.
Do these errors show up as a result of calls to read? Are they currently
fatal errors?
In xgawk, reading input in XML mode is handled by feeding input text
chunks to the expat parser, which in turn delivers "records" in the form
of XML SAX events. So XML files that are not well-formed generate faulty
"records" in the middle of the file, and these are considered non-fatal.
The current action is to set ERRNO, automatically ignore the rest of the
file, and proceed to the next input file.
This means that the error notification is mixed with the next valid
record: by then, FILENAME and the other special values related to the
faulty record have already been updated to refer to the new record (no
problem if the faulty file is the last one).
We could create a special XMLERROR event to notify non-fatal errors at
the same level as normal records, but the addition of the ENDFILE
feature opens another possibility: reporting non-fatal errors and then
automatically continuing with the next input file. This new
possibility can also be used to report errors in regular text files,
not only XML ones.
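The ordering argument above can be sketched in a few lines of C. This is
only an illustration, not xgawk or gawk code: `struct input`,
`struct report`, and `process_all` are hypothetical names, and the error
value is simply supplied by the caller instead of coming from a parser.
The point it tries to show is that an end-of-file hook runs while the
current file name still refers to the faulty file.

```c
#define MAX_REPORTS 8

/* Hypothetical input description: a name and the error (0 = none)
 * that reading the file eventually runs into. */
struct input { const char *name; int err; };

/* What the end-of-file hook observed: which file, and which error. */
struct report { const char *filename; int err; };

/*
 * Walk the inputs in order. The hook runs before the next file is
 * opened, so an error is attributed to the file that caused it.
 * Returns the number of reports written to out.
 */
static int process_all(const struct input *in, int n, struct report *out)
{
    int i, count = 0;

    for (i = 0; i < n && count < MAX_REPORTS; i++) {
        const char *current = in[i].name;  /* plays the role of FILENAME */
        int saved_err = in[i].err;         /* plays the role of ERRNO */

        /* ... the records of the current file would be handled here ... */

        /* end-of-file hook: fires before moving on to in[i + 1] */
        out[count].filename = current;
        out[count].err = saved_err;
        count++;
    }
    return count;
}
```

Called with a faulty file followed by a good one, the first report
carries the faulty file's name together with its error, and the second
carries no error.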
I think the diff below gives you what you want, as long as your version
of get_a_record puts an appropriate value into the *errcode variable.
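For readers following along, here is a minimal, self-contained C sketch
of the contract described above. It is not the gawk source:
`fake_get_a_record`, `read_one`, `enum outcome`, and the `traditional`
flag are hypothetical stand-ins. Only the idea is taken from the patch:
the reader stores an errno value through `*errcode` and returns EOF, and
the caller decides whether that is fatal or merely saved for ERRNO.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical outcome codes for the caller; not part of gawk. */
enum outcome { GOT_RECORD, CLEAN_EOF, SOFT_ERROR, FATAL_ERROR };

/*
 * Stand-in for get_a_record(): returns a record length, or EOF.
 * simulate < 0 pretends a record was read, 0 is a clean end of file,
 * and > 0 is a read failure with that errno value. On failure the
 * value is stored through errcode (when the caller supplied one)
 * and EOF is still returned.
 */
static int fake_get_a_record(int simulate, int *errcode)
{
    if (simulate < 0)
        return 5;               /* pretend a 5-byte record was read */
    if (simulate > 0 && errcode != NULL)
        *errcode = simulate;
    return EOF;
}

/*
 * Stand-in for the caller, mirroring the shape of the patched code.
 * saved_errno plays the role of the value handed to ERRNO.
 */
static enum outcome read_one(int simulate, int traditional, int *saved_errno)
{
    int errcode = 0;
    int cnt = fake_get_a_record(simulate, &errcode);

    if (cnt != EOF)
        return GOT_RECORD;
    if (errcode > 0) {
        if (traditional)
            return FATAL_ERROR; /* the real code calls fatal() here */
        *saved_errno = errcode; /* non-fatal: kept for the ENDFILE block */
        return SOFT_ERROR;
    }
    return CLEAN_EOF;
}
```

A reader that never writes to `*errcode` leaves it at zero, so the caller
sees an ordinary end of file. That is why the value stored there matters
for any alternative implementation of the reader.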
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
Thanks again for the feedback.
Arnold
----------------------------------------------------------------------------------
--- io.c.save 2008-12-25 08:59:30.000000000 +0200
+++ io.c 2009-02-03 22:29:31.000000000 +0200
@@ -353,7 +353,8 @@
fname = arg->stptr;
errno = 0;
curfile = iop_open(fname, binmode("r"), &mybuf, & isdir, FALSE);
- update_ERRNO();
+ if (! do_traditional)
+ update_ERRNO();
/* This is a kludge. */
unref(FILENAME_node->var_value);
@@ -442,17 +443,25 @@
char *begin;
register int cnt;
int retval = 0;
+ int errcode = 0;
if (at_eof(iop) && no_data_left(iop))
cnt = EOF;
else if ((iop->flag & IOP_CLOSED) != 0)
cnt = EOF;
else
- cnt = get_a_record(&begin, iop, NULL);
+ cnt = get_a_record(&begin, iop, & errcode);
if (cnt == EOF) {
cnt = 0;
retval = 1;
+ if (errcode > 0) {
+ if (do_traditional)
+ fatal(_("error reading input file `%s': %s"),
+ iop->name, strerror(errcode));
+ else
+ update_ERRNO_saved(errcode);
+ }
} else {
NR += 1;
FNR += 1;
@@ -959,10 +968,12 @@
lintwarn(_("close: `%.*s' is not an open file, pipe or co-process"),
(int) tmp->stlen, tmp->stptr);
- /* update ERRNO manually, using errno = ENOENT is a stretch. */
- cp = _("close of redirection that was never opened");
- unref(ERRNO_node->var_value);
- ERRNO_node->var_value = make_string(cp, strlen(cp));
+ if (! do_traditional) {
+ /* update ERRNO manually, using errno = ENOENT is a stretch. */
+ cp = _("close of redirection that was never opened");
+ unref(ERRNO_node->var_value);
+ ERRNO_node->var_value = make_string(cp, strlen(cp));
+ }
free_temp(tmp);
return tmp_number((AWKNUM) -1.0);
@@ -3037,13 +3048,10 @@
iop->flag |= IOP_AT_EOF;
return EOF;
} else if (iop->count == -1) {
- if (! do_traditional && errcode != NULL) {
+ iop->flag |= IOP_AT_EOF;
+ if (errcode != NULL)
*errcode = errno;
- iop->flag |= IOP_AT_EOF;
- return EOF;
- } else
- fatal(_("error reading input file `%s': %s"),
- iop->name, strerror(errno));
+ return EOF;
} else {
iop->dataend = iop->buf + iop->count;
iop->off = iop->buf;
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381
Nof Ayalon Cell Phone: +972 50 729-7545
D.N. Shimshon 99785 ISRAEL