Re: Stripping HTML attributes and tags



JJ Harrison wrote:

> On Mon, 28 Nov 2005 09:58:34 +1100, dorayme <dorayme@xxxxxxxxxxxxxxx>
> wrote:
>
>>> From: JJ Harrison <noodle_snacks@xxxxxxxxxxxxxx>
>>>
>>> Are there any ASP scripts or programs available that will
>>> automatically strip selected HTML tags and remove attributes
>>> (such as style info, etc.)?
>>
>>
>>I use the BBEdit text editor; its grep pattern search and replace
>>function (or indeed often the regular search and replace) will
>>work for all files in any selected folder. Powerful stuff,
>>handle with care.
>
> I am not very confident with regular expressions. Can anyone suggest a
> good guide or some expressions that would remove all tags except
> <p>, <br>, <ul>, <li>, <b>, <em>, <i>, <strong> and remove all remaining
> attributes from the existing tags?

I could, but this is easier:

http://uk2.php.net/strip_tags
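
One caveat: strip_tags keeps the tags you allow but leaves their
attributes alone, so style= and friends survive on the tags you keep.
If you need both halves of the job, here is a rough sketch of the idea
in Python using the standard library's html.parser. The names
ALLOWED_TAGS, TagStripper and strip_html are made up for this example,
and it assumes you want to keep the text inside the tags that get
removed. Treat it as a starting point, not a hardened sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Tags to keep, taken from the list in the question (hypothetical name).
ALLOWED_TAGS = {"p", "br", "ul", "li", "b", "em", "i", "strong"}


class TagStripper(HTMLParser):
    """Re-emit allowed tags without attributes; drop all other tags.

    Text inside dropped tags is kept (this includes the contents of
    <script> and <style>, which come through as plain text).
    Comments and doctype declarations are discarded.
    """

    def __init__(self, allowed=ALLOWED_TAGS):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is ignored on purpose: that is what removes the attributes.
        if tag in self.allowed:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        # <br> is a void element, so never emit a closing tag for it.
        if tag in self.allowed and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape them.
        self.parts.append(escape(data, quote=False))


def strip_html(html_text, allowed=ALLOWED_TAGS):
    """Return html_text with only the allowed tags left, attribute-free."""
    parser = TagStripper(allowed)
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = '<p style="color:red">Hi <span class="x">there</span><br/></p>'
    print(strip_html(sample))
```

Using a parser instead of regular expressions means attribute values
containing ">" or odd quoting do not trip up the matching.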

--
Jim