[mew-int 2933] Re: Identify urls
Christophe TROESTLER
Christophe.Troestler at example.com
Sat Nov 6 01:01:34 JST 2010
On Fri, 5 Nov 2010 13:07:30 -0200, Diogo F.S.Ramos wrote:
>
> > I am not sure allowing closing braces inside URLs is the way to go.
> > Sure some URLs contain braces but these are usually balanced. I
> > personally use the following regex to allow "depth 1" matching braces.
>
> You have a valid point, although I don't see the disadvantage of
> allowing closing braces inside URLs, but I guess it should correctly
> identify URLs like `(http://www.example.com/foo(bar))baz' as
> `http://www.example.com/foo(bar)'.
The problem with “(http://www.example.com/foo(bar))baz” is that it is
ambiguous. Did the user mean that “http://www.example.com/foo(bar)”
is the URL and simply forget the space after the closing brace? The only
URLs with braces I have seen are from Microsoft (MSDN), and
these have matching braces. I have never seen URLs with two pairs of
braces, whether consecutive like “http://www.example.com/foo(bar)(zzz)”
or nested like “http://www.example.com/foo(b(a)r)”.
> Unfortunately I tried your solution with `re-builder' but it always
> stop recognizing the URL at a closing parentheses. Could you verify if
> it is working there for you?
You are correct; here is a better (although not yet perfect) version:
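(setq mew-regex-url
      ;; `u': any run of characters other than space, newline, `>',
      ;; parentheses, and quote characters.
      (let ((u "[^ \n>()\"`'“”]*"))
        (concat "\\b\\(\\(\\(file\\|news\\|mailto\\):\\)"
                "\\|\\(\\(s?https?\\|ftp\\|gopher\\|telnet\\|wais\\)://\\)\\)"
                ;; Body: balanced "(...)" groups of depth 1, or runs not
                ;; ending in `.', `,' or `:'.
                "\\((" u ")\\|" u "[^ \n>()\"`'“”.,:]\\)+")))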
Best,
C.