WANTED: class for parsing rfc (x)822 addresses

by Bill Welliver

I'm in the process of writing a list server application as I find mailman
to be entirely too heavy. It occurred to me that in the course of this
effort, I'll need to manipulate email addresses. Rather than duplicate
work, does anyone have a class that can convert a string into an object
that represents the email address; understanding all of the various
formats that may or may not include real names? Anything you might have
that you'd be willing to share would be most appreciated!

Bill


Re: WANTED: class for parsing rfc (x)822 addresses

by Henrik Grubbström-2

On Thu, 17 Jul 2008, Bill Welliver wrote:

> I'm in the process of writing a list server application as I find mailman to
> be entirely too heavy. It occurred to me that in the course of this effort,
> I'll need to manipulate email addresses. Rather than duplicate work, does
> anyone have a class that can convert a string into an object that represents
> the email address; understanding all of the various formats that may or may
> not include real names? Anything you might have that you'd be willing to
> share would be most appreciated!

Take a look at MIME.decode_words_tokenized_remapped() et al.

> Bill

--
Henrik Grubbström grubba@...
Roxen Internet Software AB

Re: WANTED: class for parsing rfc (x)822 addresses

by Henrik Grubbström-2

On Thu, 17 Jul 2008, Bill Welliver wrote:

> Thanks for the tip. I've taken that and turned it into the attached class.
>
> To anyone interested, please feel free to have a look and see if you can get
> it to break (I'm not terribly imaginative but have tried a few test cases).
[...]

From a quick look, it doesn't seem to support old-style addresses, e.g.:

   grubba@... (Grubba)

which were common in ~1993 and earlier.

> It should spit out obviously incorrect addresses (based on my quick read of
> the RFCs). It's amazing how lenient the specs are... this class should accept
> a lot of emails that most mta would reject.

True.

> Bill
>
> On Thu, 17 Jul 2008, Henrik Grubbström wrote:
>
>> Take a look at MIME.decode_words_tokenized_remapped() et al.

--
Henrik Grubbström grubba@...
Roxen Internet Software AB

Re: WANTED: class for parsing rfc (x)822 addresses

by Stephen R. van den Berg

Bill Welliver wrote:
>Yes, I assume you're referring to the fact that the sections within
>parentheses are dropped. Apparently that's correct according to the spec,
>as (X) is a "comment".

>More importantly, the mime module doesn't pass these on during
>tokenization. Perhaps that could be considered a bug? I'd think it should
>at least be included in the output stream marked as a comment token.

At first glance, I'd say you're right, or there should, at least, be a
way to get the comment-tokens.
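
To illustrate what "included in the output stream marked as a comment token" could mean, here is a minimal sketch of a tokenizer that labels every token and keeps comments. It is written in Python rather than Pike and is not the MIME module's implementation; the function name tokenize_labeled, the label strings, and the example.org addresses are assumptions made up for this example. It covers only a simplified subset of RFC 822: nested comments and quoted strings are handled, while domain literals and encoded words are not.

```python
# Characters RFC 822 treats as "specials" (simplified handling below).
SPECIALS = set('()<>@,;:\\".[]')


def tokenize_labeled(header):
    """Split a header value into (label, value) pairs, keeping comments.

    Hypothetical helper for illustration only; labels are "word",
    "special" and "comment".
    """
    tokens = []
    i, n = 0, len(header)
    while i < n:
        ch = header[i]
        if ch.isspace():
            i += 1
        elif ch == "(":
            # Comments may nest, so track the parenthesis depth.
            depth, buf = 1, []
            i += 1
            while i < n:
                c = header[i]
                if c == "\\" and i + 1 < n:
                    # Backslash quotes the next character.
                    buf.append(header[i + 1])
                    i += 2
                    continue
                if c == "(":
                    depth += 1
                elif c == ")":
                    depth -= 1
                    if depth == 0:
                        i += 1
                        break
                buf.append(c)
                i += 1
            tokens.append(("comment", "".join(buf)))
        elif ch == '"':
            # Quoted string: collect everything up to the closing quote.
            buf = []
            i += 1
            while i < n and header[i] != '"':
                if header[i] == "\\" and i + 1 < n:
                    i += 1
                buf.append(header[i])
                i += 1
            i += 1  # step past the closing quote, if there was one
            tokens.append(("word", "".join(buf)))
        elif ch in SPECIALS:
            tokens.append(("special", ch))
            i += 1
        else:
            # Plain atom: runs until whitespace or a special character.
            j = i
            while j < n and not header[j].isspace() and header[j] not in SPECIALS:
                j += 1
            tokens.append(("word", header[i:j]))
            i = j
    return tokens


for label, value in tokenize_labeled("grubba@example.org (Grubba)"):
    print(label, repr(value))
```

With the comment kept as its own labeled token, the caller can decide whether to treat it as a name, store it, or ignore it.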
--
Sincerely,
           Stephen R. van den Berg.
"It has been said that the only standard thing about all UNIX systems is the
 message-of-the-day telling users to clean up their files." -- SysV.2 manual


Re: WANTED: class for parsing rfc (x)822 addresses

by Henrik Grubbström-2

On Sat, 19 Jul 2008, Stephen R. van den Berg wrote:

> Bill Welliver wrote:
>> Yes, I assume you're referring to the fact that the sections within
>> parentheses are dropped. Apparently that's correct according to the spec,
>> as (X) is a "comment".
>
>> More importantly, the mime module doesn't pass these on during
>> tokenization. Perhaps that could be considered a bug? I'd think it should
>> at least be included in the output stream marked as a comment token.
>
> At first glance, I'd say you're right, or there should, at least, be a
> way to get the comment-tokens.
In that case, use MIME.decode_words_tokenized_labled_remapped() instead.

My original issue was that I didn't see that Bill's code supported
addresses that weren't inside '<' and '>'.

> --
> Sincerely,
>           Stephen R. van den Berg.

--
Henrik Grubbström grubba@...
Roxen Internet Software AB

Re: WANTED: class for parsing rfc (x)822 addresses

by Bill Welliver

> In that case, use MIME.decode_words_tokenized_labled_remapped() instead.
>
> My original issue was that I didn't see that Bill's code supported addresses
> that weren't inside '<' and '>'.
>

Ah, it's possible I caught that case just after I sent the original email,
as my current version seems to do just fine with yours.

Not sure, though, what to do about names in comments... it's just as
likely that the comment could be something other than a name. Any
thoughts?

Bill


Re: WANTED: class for parsing rfc (x)822 addresses

by Martin Bähr

On Tue, Jul 22, 2008 at 09:56:32AM -0400, Bill Welliver wrote:
> Not sure, though, what to do about names in comments... it's just as
> likely that the comment could be something other than a name. Any
> thoughts?

then it would still be part of the email address.

i see
"Pike spikes and roxen rocks" <pike@...>
and
pike@... (Pike spikes and roxen rocks)
as equivalent.

i am not aware that the comment is supposed to be a name. how would you
verify that anyways? it's just a string. at best it may not contain an @
(or " or (), depending on the variant) or some other character limitations,
but what else except storing it somewhere do you want to do with it?
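
as a rough illustration of "just store it", here is a small sketch of an address object that keeps whatever text came with the address, whether it arrived as a quoted phrase or as a trailing comment. it uses Python's standard email.utils rather than Pike's MIME module, so it is only an analogy; the class name Mailbox, its field names, and the example.org addresses are assumptions made up for this example.

```python
from dataclasses import dataclass
from email.utils import formataddr, parseaddr


@dataclass(frozen=True)
class Mailbox:
    """Hypothetical address object: the address plus any attached text."""

    address: str
    label: str = ""  # quoted phrase or comment; not assumed to be a name

    @classmethod
    def parse(cls, text):
        # parseaddr accepts both '"Label" <addr>' and 'addr (Label)'.
        label, address = parseaddr(text)
        if not address:
            raise ValueError(f"no address found in {text!r}")
        return cls(address=address, label=label)

    def __str__(self):
        # Always write the address back out in the angle-bracket form.
        return formataddr((self.label, self.address))


new_style = Mailbox.parse('"Pike spikes and roxen rocks" <pike@example.org>')
old_style = Mailbox.parse("pike@example.org (Pike spikes and roxen rocks)")
print(new_style == old_style)
print(old_style)
```

the label is stored verbatim and never validated as a name, so both spellings end up as the same object.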

greetings, martin.
--
cooperative communication with sTeam      -     caudium, pike, roxen and unix
offering: programming, training and administration   -  anywhere in the world
--
pike programmer   working in china                      community.gotpike.org
unix system-      iaeste.(tuwien.ac|or).at                     open-steam.org
administrator     caudium.org                                    is.schon.org
Martin Bähr       http://www.iaeste.or.at/~mbaehr/
