A few months ago, I blogged about our foray into building a new, improved Badware URL Clearinghouse. At the time, we were starting a three-month pilot project. That pilot has since concluded, and I'm back to share what we learned and accomplished during that time.
On the technical side, our developer, Matthew, built a production-ready platform to store badware URLs and associated data. He stuck with his original plan to use MongoDB and Java, and it seems to have worked well. He had to perform some multithreading magic to resolve large numbers of domain names efficiently within Java, but he pulled it off. We look forward to migrating the data we currently collect from our data providers and our review process onto the new platform in the coming months.
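For readers curious what that kind of multithreaded resolution can look like, here is a minimal sketch. It is not Matthew's code, and the real platform may work quite differently: the class name `BulkResolver`, the method names, and the thread count are all made up for illustration. The idea is simply that DNS lookups spend most of their time waiting on the network, so handing them to a fixed-size thread pool lets many of them wait at once instead of one after another.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only; not the Clearinghouse implementation.
public class BulkResolver {

    /** Blocking lookup via the JDK resolver; returns an empty list if the name does not resolve. */
    static List<String> dnsLookup(String host) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                ips.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Unresolvable names are expected in this kind of data; report them as "no addresses".
        }
        return ips;
    }

    /**
     * Resolves each distinct host on a fixed-size thread pool.
     * The lookup function is a parameter so it can be swapped out (for example, in tests).
     */
    static Map<String, List<String>> resolveAll(List<String> hosts, int threads,
                                                Function<String, List<String>> lookup) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so the lookups overlap, skipping duplicate names.
            Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
            for (String host : hosts) {
                if (!pending.containsKey(host)) {
                    pending.put(host, pool.submit(() -> lookup.apply(host)));
                }
            }
            // Then collect the results in the order the hosts were first seen.
            Map<String, List<String>> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<String>>> entry : pending.entrySet()) {
                try {
                    results.put(entry.getKey(), entry.getValue().get());
                } catch (ExecutionException e) {
                    // The lookup itself threw; treat it like an unresolved name.
                    results.put(entry.getKey(), new ArrayList<>());
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while resolving hosts", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> resolved =
                resolveAll(List.of("localhost", "example.com"), 8, BulkResolver::dnsLookup);
        resolved.forEach((host, ips) -> System.out.println(host + " -> " + ips));
    }
}
```

A sketch like this leaves out things a production system would likely need, such as per-lookup timeouts, retries, and rate limiting toward the DNS servers being queried.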
I'm an executive, not a developer, so for me, the more interesting part of the pilot was talking with current and potential Partners about their interest in data sharing. Nearly every company we talked to craves data, whether to help clean up their own environments (in the case of hosting providers and registrars, for example) or to better protect their customers (in the case of security vendors). But would they be willing to share data to get data? We heard several reservations about sharing data:
- Revealing proprietary methods or information.
- Losing competitive advantage.
- Violating legal restrictions on sharing.
- Helping freeloaders.
- Giving away data that could be marketable.
- Exposing themselves to liability or negative PR.
Despite these concerns, several Partners are still interested. Why? The aforementioned demand for data is one reason. Another is the opportunity to help shape a new effort with great potential for helping the Web: by sharing data, our Partners will help each other protect users and help StopBadware report on badware trends and facilitate cleanup efforts. Some Partners also recognized that a data sharing program is a vehicle for demonstrating their expertise to, and learning from, industry peers.
In that spirit, we're putting together plans to try out a data sharing program with a handful of our Partners. We'll use the new platform that Matthew built, and Partners will be required to contribute substantive data of their own if they want to see others' data. Eventually, we plan to build an API and a Web interface, though we'll likely start with a much more basic daily data feed. Meanwhile, we'll continue looking for opportunities to learn from, and perhaps even combine efforts with, other data sharing initiatives already underway.