End Point

Welcome to End Point's blog

Ongoing observations by End Point people.

Announcing Ruby gem: email_verifier

How many times have you tried to provide a really nice validation solution for fields containing user emails? Most of the time, the best we can come up with is some long and incomprehensible regex we found on StackOverflow or somewhere else on the Internet.

But that's really only a partial solution. Email format correctness is tricky enough to get right with regular expressions, and even when we do get it right, it gives us no assurance that the address the user entered actually exists.

But it does a great job at finding typos and misspellings, right?

Yes - but I'd argue that it doesn't cover the full range of that kind of data entry error. The user could fill in 'whatever', and traditional validation through regexes would do a great job at finding out that it's not really an email address. But what I'm concerned with here are all those situations when I fat-finger kaml@endpoint.com instead of kamil@endpoint.com.

Some would argue at this point that it's still recoverable since I can find out about the error on the next page in a submission workflow, but I don't want to spend another something-minutes on going through the whole process again (possibly filling out tens of form fields along the way).

And look at this issue from the point of view of a web application owner: you'd like to be sure that all those leads you have in your database point to real people, and that some percentage of them will at some point end up paying you real money, making you a living. What if even 10% of the email addresses are invalid (well-formed, but pointing to no real mailbox) due to user error? What would that potentially mean to you in cash?

The Solution

Recently, I faced this email validation question for mobixa.com. (By the way, if you own a smartphone that you'd like to sell - there is no better place than mobixa.com to do it!)

I'd like to announce the results of my work here and now. Please give a warm welcome to a newborn citizen of RubyGems society: email_verifier

How does it work?

Email verifier takes a different approach to email validation. Instead of checking just the format of the address in question, it actually tries to connect to the mail server and pretends to send a real mail message. We can call it 'asking the mail server if the recipient exists'.
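To make the idea concrete, here is a minimal sketch of what such a conversation with a mail server might look like. It is an illustration of the general technique, not the gem's actual code: the helper names smtp_probe_commands and classify_rcpt_reply are made up for this example, and it assumes the check relies on the server's reply to the RCPT TO command. No network connection is made here; the sketch only builds the commands and interprets a reply line.

```ruby
# Hypothetical helpers for illustration only; they are not part of email_verifier.

# Builds the SMTP commands a probe would send. The message body (DATA)
# is never sent, so no mail is actually delivered.
def smtp_probe_commands(verifier_email, recipient, helo_domain)
  [
    "HELO #{helo_domain}",
    "MAIL FROM:<#{verifier_email}>",
    "RCPT TO:<#{recipient}>",
    "QUIT"
  ]
end

# Interprets the server's reply to RCPT TO using the standard SMTP
# reply code classes: 2xx accepted, 4xx temporary failure, 5xx rejected.
def classify_rcpt_reply(reply_line)
  code = reply_line[0, 3].to_i
  case code
  when 250, 251 then :accepted
  when 400..499 then :temporary_failure
  when 500..599 then :rejected
  else :unknown
  end
end

commands = smtp_probe_commands("realname@realdomain.com", "kaml@endpoint.com", "realdomain.com")
puts commands
puts classify_rcpt_reply("550 5.1.1 User unknown")
```

Note that an :accepted reply only means the server did not refuse the recipient at that moment; some servers accept every address and bounce later.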

How to use it?

Add this line to your application's Gemfile:

gem 'email_verifier'

And then execute:

$ bundle

Or install it yourself as:

$ gem install email_verifier

Some SMTP servers will refuse the check unless you present yourself as a real user. So the first thing you need to do is put something like this in an initializer or in your application.rb file:

EmailVerifier.config do |config|
  config.verifier_email = "realname@realdomain.com"
end

Then add this to your model:

validates_email_realness_of :email

Or - if you'd like to use it outside of your models:

EmailVerifier.check(youremail)

This method will return true or false, or will raise an exception with nicely detailed info about what's wrong.
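Since there are three possible outcomes (true, false, or an exception), calling code usually wants to tell them apart. Below is a rough sketch of one way to do that. The verification_outcome helper is hypothetical, and the checker argument is simply any callable standing in for the gem's check method, so the example runs without the gem or a network connection.

```ruby
# Hypothetical wrapper for illustration; `checker` is any callable that
# behaves like the check described above: returns true/false or raises.
def verification_outcome(address, checker)
  if checker.call(address)
    [:exists, nil]
  else
    [:missing, nil]
  end
rescue StandardError => e
  # The check itself failed (timeout, DNS problem, etc.), so we know
  # nothing about the address. The caller decides what to do next.
  [:error, e.message]
end

# Stand-in checkers used only to demonstrate the three outcomes.
always_ok   = ->(_address) { true }
always_bad  = ->(_address) { false }
always_fail = ->(_address) { raise "connection timed out" }

p verification_outcome("kamil@endpoint.com", always_ok)
p verification_outcome("kaml@endpoint.com", always_bad)
p verification_outcome("kamil@endpoint.com", always_fail)
```

In a real application you would pass something like a lambda that calls the gem's check method instead of the stand-ins above.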

Read more about the extension at Email verifier RDoc or try to sell your smartphone back here at Mobixa.com.

7 comments:

Anonymous said...

We can call it 'asking mail server if recipient exists'.

This is correctly called "completely broken"
- The mailbox may exist and accept mail but no human being may read it
- The tool to actually do this is called "VRFY" in SMTP land but this is disabled for security and spam reasons most often
- The mailbox may exist but belong to the wrong person.
- The mailbox may not exist, but the SMTP server may accept mail unconditionally and later send a bounce

The only way to correctly verify emails is to actually send an email with a link containing the email address plus an HMAC
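For readers unfamiliar with the click-through approach described in this comment, a minimal sketch could look like the following. It is not part of email_verifier; the method names, the secret, and the example.com URL are all assumptions made for illustration. The idea is that only the application knows the secret, so only the application can produce a valid token for a given address.

```ruby
require "openssl"
require "uri"

# Hypothetical helpers for illustration only.

# Computes an HMAC-SHA256 of the (lowercased) address with a server-side secret.
def confirmation_token(email, secret)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, email.downcase)
end

# Builds the link that would be emailed to the user (example.com is a placeholder).
def confirmation_link(email, secret)
  query = URI.encode_www_form(email: email, token: confirmation_token(email, secret))
  "https://example.com/confirm?#{query}"
end

# Compares two strings without bailing out at the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Checks a token that came back from a clicked link.
def valid_token?(email, token, secret)
  secure_equal?(confirmation_token(email, secret), token)
end

secret = "replace-with-a-long-random-secret" # assumption: kept private on the server
puts confirmation_link("kamil@endpoint.com", secret)
```

When the user clicks the link, the application recomputes the token for the address in the link and marks the address as confirmed only if the two match.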

Kamil Ciemniewski said...

Anonymous - thanks for feedback.

I'd like to discuss the issues you brought up:

"The mailbox may exist and accept mail but no human being may read it"

True - we can't defend ourselves against every failure case, but we can still minimize some of the dangers.

"The tool to actually do this is called "VRFY" in SMTP land but this is disabled for security and spam reasons most often"

And that's exactly the reason why we're not using it.

"The mailbox may exist but belong to the wrong person."
"The mailbox may not exist, but the SMTP server may accept mail unconditionally and later send a bounce"

True and true again, but it's still better to accept some percentage of useless addresses than to collect all of them.

Kamil

Jon Jensen said...

This conversation would actually be good to edit and put in the documentation for the gem. There's format validation (which Kamil described in the article), then this attempt to start sending mail, then full click-through verification with a token as anonymous describes. All useful in their place, and probably should explain exactly how this gem works among the various options out there.

Thanks for releasing this gem to the public, Kamil!

Eitan Adler said...

Another concern: if the system administrator disabled VRFY wouldn't it be a safe assumption that she doesn't want you enumerating or verifying email addresses in this manner?

How do you deal with temporary failures? Do you resend the callout again in the future?

What is the purpose of this module? To confirm an email belongs to the user as she typed it? It doesn't help?

To report in real time about potentially wrong email addresses on form submission? Wouldn't the latency be absurd?

I could only see this being useful to clean a dirty email list before sending out confirmed-opt-in letters. In this case though the volume of probes may very well be considered abusive.

This post inspired me to write a little rant about how to verify email addresses:
http://blog.eitanadler.com/2012/12/correctly-verifying-email-address.html

Steph Skardal said...

A couple of answers:

>> How do you deal with temporary failures? Do you resend the callout again in the future?

Temporary failures are ignored, i.e. in the case of an exception for any reason, the email validation is skipped and users have a frictionless way of continuing the sign-up process. Obviously, this allows a few bad emails through the system.

>> What is the purpose of this module? To confirm an email belongs to the user as she typed it? It doesn't help?

Immediately after an email is entered (JS onchange event), this module validates the realness of it. In case the email is invalid, the user cannot proceed to the next sign-up step.

>>To report in real time about potentially wrong email addresses on form submission? Wouldn't the latency be absurd?

Latency is less of an issue than with any of the email validation services that were investigated for this business need, which claim to have 4-7 seconds of latency. So at this point, no, latency is not an issue, and the gem has served the business need well.

So far, given its known pros and cons, this approach has served the business need for validating emails well.

Anonymous said...

Is it possible that their server starts blocking your email in the future? (this is mentioned in a similar gem: https://github.com/pash/email_veracity_checker)

Kamil Ciemniewski said...

Anonymous:

While it's true that it's possible - you have to ask yourself a question: will my app use it hundreds of thousands of times per day? Will I have millions of users who will have to be checked every day? If you'll e.g. only check a handful of users/emails every day, then it's very unlikely that you'll get blacklisted by any SMTP server.

While I agree with what experts are saying in theory about this method, I reckon that the reality check is worth much more than the most elaborate theory.

On the other hand - theories don't come from nowhere. You just have to gauge your situation properly.