Re: Regex Help
- From: Ezra Zygmuntowicz <ezra@xxxxxxxxxxxxxxxxx>
- Date: Tue, 23 Aug 2005 07:27:46 +0900
David-
Thanks, the regex you posted works great. I had considered just trimming the text inside the tags and then untrimming until a word end, but I figured there would be a regex that would do it all at once.
Thanks Dave-
Ezra
On Aug 22, 2005, at 2:53 PM, David A. Black wrote:
Hi --
On Tue, 23 Aug 2005, John Halderman wrote:
Seems to me that you're trying to do too much with one regular expression. I
would just grab the content between your tags and then trim that down to 50
characters and reassemble it afterwards.
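A minimal sketch of that two-step approach, for comparison. The helper name close_after_50 and its limit parameter are hypothetical and do not come from the thread. It assumes the requirement stated in the original question: cut 50 characters after the begad tag, and if the cut lands mid-word, extend it to the end of that word.

```ruby
# Hypothetical helper (not from the thread): split the line at the
# <begad:...> tag, find the cut point in the ad text, then reassemble.
def close_after_50(line, limit = 50)
  m = line.match(/\A(.*?<begad:[^>]+>)(.*)\z/)
  return line unless m # no begad tag: leave the line untouched

  head = m[1]
  body = m[2]
  cut = [limit, body.length].min

  # If the cut falls between two word characters, move it right
  # until the current word is complete.
  cut += 1 while cut < body.length && body[cut - 1] =~ /\w/ && body[cut] =~ /\w/

  head + body[0, cut] + "</ftditm>" + body[cut..-1]
end

ad = "<ftditm><begad:11559303>Selah Country Home 1.5 acres. 3 bdrm, 2 bath, " \
     "irrigation, horse barn. $122,000. 509-697-6519<endad>"
puts close_after_50(ad)
```

For the sample ad this places the closing tag right after "irrigation". It is more code than a single substitution, but the cut rule is spelled out explicitly.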
I'm not sure what you mean by "too much". I think the substitution I suggested does what Ezra said he needed. Is there an error in it?
David
-j
On 8/22/05, David A. Black <dblack@xxxxxxxxxxxx> wrote:
Hi --
On Tue, 23 Aug 2005, Ezra Zygmuntowicz wrote:
Hey Guys-
I have a regex problem that I am not sure how to tackle. I am parsing
some classified ads in order to format them for display online. I have
most of the parsing done but I need help with the final step. So the
file has one ad per line and a line looks like this:
<ftditm><begad:11559303>Selah Country Home 1.5 acres. 3 bdrm, 2 bath,
irrigation, horse barn. $122,000. 509-697-6519<endad>
Now I have already parsed everything to get it to this state but what I
need to do next is to count 50 chars after the <begad:11559303> tag and
insert </ftditm>.
But the tricky part is that I need to place the </ftditm> 50 characters
into the line, but if the 50 chars end in the middle of a word then I
need to match the rest of the word as well. So I need a way to match at
least 50 chars plus the rest of the current word if the 50th char lands
in the middle of a word.
So for this particular ad 50 chars makes it to here:
<ftditm><begad:11559303>Selah Country Home 1.5 acres. 3 bdrm, 2 bath, irri
#<= 50 chars ends here# gation, horse barn. $122,000. 509-697-6519<endad>
So it ends in the middle of the word irrigation and I need it to consume
the whole word.
Here's one idea:
str.sub(/(<begad:[^>]+>.{1,50}.*?\b)/, "\\1<\/ftditm>")
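As a quick check, here is the substitution applied to the sample ad from the question. The variable names ad and result are only for illustration. The one change from the line above is that the slash in the replacement string is written without a backslash, which produces the same string in Ruby.

```ruby
# The sample ad from the original question, as a single line.
ad = "<ftditm><begad:11559303>Selah Country Home 1.5 acres. 3 bdrm, 2 bath, " \
     "irrigation, horse barn. $122,000. 509-697-6519<endad>"

# .{1,50} greedily takes up to 50 characters after the <begad:...> tag,
# then .*?\b lazily extends the match to the nearest word boundary.
result = ad.sub(/(<begad:[^>]+>.{1,50}.*?\b)/, "\\1</ftditm>")

puts result
```

The 50th character after the tag is the second "i" of "irrigation", so the lazy .*?\b runs on to the end of that word and the closing tag is inserted just before the following comma.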
David -- David A. Black dblack@xxxxxxxxxxxx
-Ezra Zygmuntowicz Yakima Herald-Republic WebMaster 509-577-7732 ezra@xxxxxxxxxxxxxxxxx