A while ago I wrote about spam, OpenID and Mollom. I mentioned that at Mollom we maintain an internal reputation for each OpenID identifier that we encounter while assessing submitted content. In addition to that, we could also assess the reputation of the OpenID identity provider (IdP), which is useful in its own right.

It is still early days, but I believe that any identity system (e.g. OpenID) needs a reputation component (e.g. Mollom). I also believe that any reputation system needs to be able to establish an identity first.

Mollom's reputation system tries to predict future behavior by looking at past behavior. To do this, Mollom keeps track of past behavior, and updates its assessment as it receives more data about the user. At the same time, Mollom forgets. In other words, one must consistently behave well to maintain a good reputation.
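As a rough illustration of "remember, update, and forget" (this is not Mollom's actual algorithm, which is not described here), the sketch below keeps an exponentially decayed tally of good and bad observations. The class name, the 30-day half-life, the neutral value of 0.5 and the single "prior" pseudo-observation are all assumptions made for the example.

```python
class DecayingReputation:
    """Hypothetical reputation in [0, 1] that forgets old behavior over time."""

    def __init__(self, half_life_days=30.0, neutral=0.5, prior=1.0):
        self.half_life_days = half_life_days  # assumed forgetting rate
        self.neutral = neutral                # score with no evidence at all
        self.prior = prior                    # weight of the neutral prior
        self.good = 0.0                       # decayed count of good observations
        self.weight = 0.0                     # decayed count of all observations
        self.last_seen = None                 # time of last observation, in days

    def _factor(self, now):
        # Fraction of old evidence that still counts at time `now`.
        if self.last_seen is None:
            return 1.0
        elapsed = max(0.0, now - self.last_seen)
        return 0.5 ** (elapsed / self.half_life_days)

    def observe(self, now, was_good):
        # Fade the existing evidence first, then add the new observation.
        factor = self._factor(now)
        self.good *= factor
        self.weight *= factor
        self.last_seen = now if self.last_seen is None else max(self.last_seen, now)
        self.weight += 1.0
        if was_good:
            self.good += 1.0

    def score(self, now):
        # Decayed share of good behavior, pulled toward neutral by the prior.
        factor = self._factor(now)
        good = self.good * factor
        weight = self.weight * factor
        return (good + self.neutral * self.prior) / (weight + self.prior)


rep = DecayingReputation()
for day in range(10):
    rep.observe(day, was_good=True)

fresh = rep.score(9)        # shortly after ten good posts
stale = rep.score(1000)     # years later, with no new activity
rep.observe(1000, was_good=False)
after_spam = rep.score(1000)
```

With these assumptions, a run of good posts lifts the score, a long silence lets it drift back to neutral, and a single recent bad post outweighs good behavior that has long since faded.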

At Mollom, we plan to open up our reputation system through an API (e.g. mollom.getReputationIdentity('http://john.myopenid.com') or mollom.getReputationProvider('http://myopenid.com')), but before we do, I'd like to solicit some feedback and invite people to participate.
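To give the discussion something concrete to react to, here is a sketch of how a site might consume such an API. Only the two method names come from the calls above; everything else is an assumption: the in-memory stand-in class, the 1-to-10 score scale, the None return for unknown identifiers, and the thresholds. No such API has been published.

```python
class FakeMollom:
    """In-memory stand-in for the proposed API (hypothetical, for illustration)."""

    def __init__(self, identities, providers):
        self._identities = identities
        self._providers = providers

    def getReputationIdentity(self, openid):
        # Assumed to return a score from 1 (bad) to 10 (good), or None if unknown.
        return self._identities.get(openid)

    def getReputationProvider(self, provider):
        # Same assumed scale, but for the identity provider as a whole.
        return self._providers.get(provider)


def moderation_action(mollom, openid, provider, accept_at=7, reject_below=3):
    """Decide what to do with a comment based on the poster's reputation."""
    score = mollom.getReputationIdentity(openid)
    if score is None:
        # Unknown identity: fall back on the provider's reputation.
        score = mollom.getReputationProvider(provider)
    if score is None:
        return "captcha"  # nothing known at all, so ask for a CAPTCHA
    if score >= accept_at:
        return "accept"
    if score < reject_below:
        return "reject"
    return "captcha"


mollom = FakeMollom(
    identities={"http://john.myopenid.com": 9},
    providers={"http://myopenid.com": 5},
)
known = moderation_action(mollom, "http://john.myopenid.com", "http://myopenid.com")
unknown = moderation_action(mollom, "http://jane.myopenid.com", "http://myopenid.com")
```

Falling back on the provider score for an unknown identifier is one way the IdP reputation could be "useful in its own right"; whether that is the right default is exactly the kind of feedback being asked for.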

Nothing drives API design better than concrete use cases and experiments in the field. How would you like to consume reputation information? What format should we use so reputation information from different sources can be combined? How can your application help us build and maintain reputation profiles? What Drupal modules can take advantage of this? What Drupal modules can help build reputation? How do we prevent abuse?

I don't expect that we can answer these questions overnight, but if you want to collaborate or prototype a few ideas, let us know. Should be fun!


John (not verified):

Does Mollom care about privacy?

Mollom doesn't announce itself anywhere on participating sites, and doesn't provide a mechanism to opt out when posting.

We can assume thousands of users are sending information to Mollom without knowing it, including minors.

Before you consider sharing this information with third parties via an API, you should be very concerned with:

1. Whether that information was collected legally in the first place.

2. Whether you have received consent from the user to share it.

Be aware of American COPPA laws:

"The Children's Online Privacy Protection Act and Rule apply to individually identifiable information about a child that is collected online, such as full name, home address, email address, telephone number or any other information that would allow someone to identify or contact the child. The Act and Rule also cover other types of information -- for example, hobbies, interests and information collected through cookies or other types of tracking mechanisms -- when they are tied to individually identifiable information." - www.coppa.org

Be aware of Canadian PIPEDA laws:

"As a general rule, the Personal Information Protection and Electronic Documents Act provides that organizations can only collect, use or disclose information for purposes for which you have given consent." - www.privcom.gc.ca

Does installing Mollom put site owners at risk of violating these laws?

sirkitree (not verified):

As long as Mollom itself does not capture any sort of age information, they should be shielded from COPPA laws. But then I'm not a lawyer; I only know from experience that as long as we kept the 'year' information out, we were okay to collect. Giving that information out, not so sure, but it may be a concern. So I would like to know exactly what information Mollom _is_ collecting.

That being said, a 'reputation' engine would be pretty cool. We're experimenting with a 'recommendation' engine right now (for music related needs) and the algorithms that determine either would probably be pretty similar I would imagine. Check out http://thefilter.com - I have some contacts there that I might be able to put you in touch with as well, Dries, if you're interested.

fintan (not verified):

The point at which data becomes personally identifiable is the point where data protection laws kick in. If you're keeping information relating to someone based on an ID, then it's clearly identifiable and as such falls under the protection of these and numerous other laws.

The point about putting sites that use Mollom in a legal quagmire is also very valid. I have installed it on a couple of our sites and planned on putting in numerous others but after seeing this post I feel that we may well have to uninstall it which is a real pity as it seems to work really well.

Matt Farina (not verified):

To add to this, there are a number of sites that use Mollom and have privacy policies that say the information they collect is shared with 3rd party companies to do stuff for their site, and that users' personal information won't be shared with others. This API system would change that.

I know these sites can't really control that (unless they negotiated some deal) but they may have done it in good faith.

While I like the idea of a reputation system, and don't see many people in the mainstream public thinking of this, I can still see privacy issues that may need to get hashed out.


Note that we are not redistributing data that could identify people or that is not otherwise public.

First, we're not capturing age information, address information, etc. We're capturing OpenID identifiers, which are public to begin with, and (typically) anonymous too.

Second, we'd only be returning a computed reputation score (e.g. a number between 1 and 10) for a given OpenID identifier. This isn't any different from the ratings (e.g. 3 gold stars) that people get on many discussion fora. We're the owner of that score, so to speak.

Third, this is what OpenID 2.0, and OpenID's attribute exchange specifically, is all about.

In other words, I don't see how we could or would provide access to information that one can use to identify a person.

Either way, Mollom is not in the business of violating laws, nor do we have a desire to put individuals at risk. If we decide to proceed with this, we'll make sure to respect people's privacy, and any privacy law that is applicable to us as a company.

All that said, I'm curious to know how applications like Facebook and Google's Open Social deal with this. Unlike Mollom, they are actually sharing a ton of information.

Freso (not verified):

But John didn't just ask whether Mollom with a reputation service would make a site conflict with the law, but whether just installing Mollom as-is would put that site in conflict with it.

You do currently say that you're capturing a website visitor's name or nickname, IP address, membership ID, OpenID, website URL and e-mail address as well as storing them - some of these might be considered data that could identify people or that is not otherwise public.


Mollom's operations comply with privacy law in the European Union. Whether it complies with privacy law in countries that are not member states of the European Union, I don't know for sure. On https://www.mollom.com/terms-of-service we provide the necessary details and transparency that could help you answer that question.

However, I would be shocked if we didn't comply. Virtually any website captures IP addresses, e-mail addresses, etc. If I were to post a comment on freso.dk, you're also capturing information that could identify me. If you run Google Analytics on your site, Google is also capturing information about me through your website.

Freso (not verified):

If you left a comment on freso.dk right now, I'd be rather surprised, and hope you would let my host know how you cracked them. :) (Posting comments on static text/html pages is quite a feat, IMHO. ;))

Anyway, I agree that it looks and sounds like everything's in perfect order, but it never hurts to ask - and asking can sometimes lead to less hurting in the long run! And, also, one thing is a site capturing information for itself, another thing is sending that information to a 3rd party. (And I also think the Danish data([/| ]privacy) laws might be a bit more strict than the general EU laws... or the EU might have put an end to that as well. :/)


I agree. Please let me know what you learned, and how we could improve our policies.

Freso (not verified):

I've wanted to ask the Danish authorities, Datatilsynet, about this for a while. Inspired by your comment, I've (just) now sent an e-mail in their direction asking how they see this. I'll notify the community when I hear back from them. :)

pwolanin (not verified):

From the above comments, it seems that at the very least you should be very cautious about having the API open to query individual OpenIDs - I can think of a number of problematic uses (e.g. query 1000s to find the existing/valid IDs at a provider).

It's hard to see a way around this - if you (for example) only returned reputation data when a comment is submitted, could you reasonably guard against sites that fake comment submission?

For that matter, what if I want to inconvenience someone by making it such that they always have to answer a CAPTCHA? Could I have my site submit dozens of fake Viagra-spam comments every day reporting their OpenID in order to drag down their reputation with Mollom?