Archive for the ‘privacy’ Category

FTC proposes a do not track privacy mechanism

Wednesday, December 1st, 2010

Today the FTC released a preliminary staff report that proposes a “do not track” mechanism that would let consumers opt out of the collection of data about their online searching and browsing activities. The FTC report says that industry self-regulation efforts on privacy have been “too slow, and up to now have failed to provide adequate and meaningful protection.”

“To reduce the burden on consumers and ensure basic privacy protections, the report first recommends that “companies should adopt a ‘privacy by design’ approach by building privacy protections into their everyday business practices.” Such protections include reasonable security for consumer data, limited collection and retention of such data, and reasonable procedures to promote data accuracy. … Second, the report states, consumers should be presented with choice about collection and sharing of their data at the time and in the context in which they are making decisions – not after having to read long, complicated disclosures that they often cannot find. … One method of simplified choice the FTC staff recommends is a “Do Not Track” mechanism governing the collection of information about consumer’s Internet activity to deliver targeted advertisements and for other purposes. Consumers and industry both support increased transparency and choice for this largely invisible practice. The Commission recommends a simple, easy to use choice mechanism for consumers to opt out of the collection of information about their Internet behavior for targeted ads. The most practical method would probably involve the placement of a persistent setting, similar to a cookie, on the consumer’s browser signaling the consumer’s choices about being tracked and receiving targeted ads.”
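The report describes the mechanism only at a high level, so here is a rough sketch (mine, not anything the FTC specifies) of how a site might honor such a persistent signal. The “DNT: 1” request header used below is a hypothetical stand-in borrowed from later browser proposals.

```python
# A minimal sketch, not the FTC's specification: the report only calls for a
# persistent browser setting signaling the user's tracking preference. Here
# that signal is modeled as a hypothetical "DNT: 1" request header.

def log_for_targeted_ads(user_id: str, url: str) -> None:
    print(f"logged visit by {user_id} to {url} for ad targeting")

def handle_request(headers: dict, user_id: str, url: str) -> str:
    """Serve a page, recording behavioral data only when the user has not opted out."""
    do_not_track = headers.get("DNT") == "1"
    if not do_not_track:
        log_for_targeted_ads(user_id, url)
    return f"page for {url} (targeted ads: {not do_not_track})"

# A browser whose persistent setting says "do not track me":
print(handle_request({"DNT": "1"}, "alice", "http://example.com/news"))
```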

The full text of the 120-page report, Protecting Consumer Privacy in an Era of Rapid Change: A Proposed Framework for Businesses and Policymakers, is available online.

Tim Berners-Lee on protecting the Web in the December Scientific American

Friday, November 19th, 2010

Sir Tim Berners-Lee discusses the principles underlying the Web and the need to protect them in an article from the December issue of Scientific American, Long Live the Web.

“The Web evolved into a powerful, ubiquitous tool because it was built on egalitarian principles and because thousands of individuals, universities and companies have worked, both independently and together as part of the World Wide Web Consortium, to expand its capabilities based on those principles.

The Web as we know it, however, is being threatened in different ways. Some of its most successful inhabitants have begun to chip away at its principles. Large social-networking sites are walling off information posted by their users from the rest of the Web. Wireless Internet providers are being tempted to slow traffic to sites with which they have not made deals. Governments—totalitarian and democratic alike—are monitoring people’s online habits, endangering important human rights.

If we, the Web’s users, allow these and other trends to proceed unchecked, the Web could be broken into fragmented islands. We could lose the freedom to connect with whichever Web sites we want. The ill effects could extend to smartphones and pads, which are also portals to the extensive information that the Web provides.

Why should you care? Because the Web is yours. It is a public resource on which you, your business, your community and your government depend. The Web is also vital to democracy, a communications channel that makes possible a continuous worldwide conversation. The Web is now more critical to free speech than any other medium. It brings principles established in the U.S. Constitution, the British Magna Carta and other important documents into the network age: freedom from being snooped on, filtered, censored and disconnected.”

Near the end of the long feature article, he mentions the Semantic Web’s linked data as one of the major new technologies the Web will give birth to, provided the principles are upheld.

“A great example of future promise, which leverages the strengths of all the principles, is linked data. Today’s Web is quite effective at helping people publish and discover documents, but our computer programs cannot read or manipulate the actual data within those documents. As this problem is solved, the Web will become much more useful, because data about nearly every aspect of our lives are being created at an astonishing rate. Locked within all these data is knowledge about how to cure diseases, foster business value and govern our world more effectively.”
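The technical move that makes this possible is simple: name things with URIs so that independently published datasets can be merged mechanically. Here is a toy sketch (my example, not from the article) of fusing two such datasets; the URIs and numbers are made up.

```python
# Toy illustration (not from the article): two independently published RDF
# datasets that use the same URI for a city can be fused by parsing both into
# one graph and querying the combined data. All URIs and values are made up.
from rdflib import Graph

health_data = """
@prefix ex: <http://example.org/> .
ex:Baltimore ex:fluCasesPerWeek 120 .
"""

census_data = """
@prefix ex: <http://example.org/> .
ex:Baltimore ex:population 620000 .
"""

g = Graph()
g.parse(data=health_data, format="turtle")
g.parse(data=census_data, format="turtle")

# One query now spans both sources because they share the URI ex:Baltimore.
for row in g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?cases ?pop WHERE {
        ex:Baltimore ex:fluCasesPerWeek ?cases ;
                     ex:population ?pop .
    }"""):
    print(f"flu cases/week: {row.cases}, population: {row.pop}")
```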

One of the benefits of linked data is that it makes data integration and fusion much easier. The benefit comes with a potential risk, which Berners-Lee acknowledges.

“Linked data raise certain issues that we will have to confront. For example, new data-integration capabilities could pose privacy challenges that are hardly addressed by today’s privacy laws. We should examine legal, cultural and technical options that will preserve privacy without stifling beneficial data-sharing capabilities.”

The risk is not unique to linked data, and new research is underway, in our lab and elsewhere, on how Semantic Web technology can also be used to protect privacy.

WSJ: many Facebook apps transmit user IDs to advertising and tracking companies

Sunday, October 17th, 2010

This Wall Street Journal article says that many of the most popular of the 550,000 Facebook apps (!) have been transmitting identifying information about users and their friends to dozens of advertising and Internet tracking companies.

“The apps reviewed by the Journal were sending Facebook ID numbers to at least 25 advertising and data firms, several of which build profiles of Internet users by tracking their online activities.

Defenders of online tracking argue that this kind of surveillance is benign because it is conducted anonymously. In this case, however, the Journal found that one data-gathering firm, RapLeaf Inc., had linked Facebook user ID information obtained from apps to its own database of Internet users, which it sells. RapLeaf also transmitted the Facebook IDs it obtained to a dozen other firms, the Journal found.

RapLeaf said that transmission was unintentional. “We didn’t do it on purpose,” said Joel Jewitt, vice president of business development for RapLeaf.”

Update: Facebook responds.

New Facebook Groups Considered Somewhat Harmful

Thursday, October 7th, 2010

I always think of things I should have added in the hour after making a post. Sigh. Here goes…

The situation is perhaps not so different from mailing lists, Google groups or any number of similar systems. I can set up one of those and add people to them without their consent — even people who are not my friends, even people whom I don’t know and who don’t know me. Such email-oriented lists can also have public membership lists. The only check on this is that most mailing list frameworks send a notice to people being added, informing them of the action. But many frameworks allow the list owner to suppress such notifications.

But still, Facebook seems different, based on how the rest of it is configured and on how people use it. I believe a common expectation is that if you are listed as a member of an open or closed group, you are a willing member.

When you get a notification that you are now a member of the Facebook group Crazy people who smell bad, you can leave the group immediately. But we have Facebook friends, many of them in fact, who only check in once a month or even less frequently. Notifications that they have been added to a group will probably be missed.

Facebook should fix this by requiring that anyone added to a group confirm that they want to be in the group before they become members. After fixing it, there’s lots more that can be done to make Facebook groups a powerful way for assured information sharing.
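To make the fix concrete, here is the kind of flow I have in mind, as a toy sketch (illustrative only, not Facebook’s actual API): adding someone creates a pending invitation, and the person becomes a member only after explicitly accepting.

```python
# Illustrative sketch of the proposed fix, not Facebook's actual API: an "add"
# only creates a pending invitation; membership requires the invitee's consent.

class Group:
    def __init__(self, name):
        self.name = name
        self.members = set()
        self.pending = set()

    def invite(self, inviter, invitee):
        if inviter not in self.members:
            raise PermissionError("only members can invite others")
        if invitee not in self.members:
            self.pending.add(invitee)     # invitee is notified but NOT yet a member

    def accept(self, invitee):
        if invitee in self.pending:
            self.pending.remove(invitee)
            self.members.add(invitee)     # membership only after consent

g = Group("Crazy people who smell bad")
g.members.add("prankster")
g.invite("prankster", "victim")
print("victim" in g.members)   # False -- no consent given yet
g.accept("victim")
print("victim" in g.members)   # True, but only after the victim opts in
```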

New Facebook Groups Considered Harmful

Thursday, October 7th, 2010

Facebook has rolled out a new version of Groups, announced on the Facebook blog.

“Until now, Facebook has made it easy to share with all of your friends or with everyone, but there hasn’t been a simple way to create and maintain a space for sharing with the small communities of people in your life, like your roommates, classmates, co-workers and family.

Today we’re announcing a completely overhauled, brand new version of Groups. It’s a simple way to stay up to date with small groups of your friends and to share things with only them in a private space. The default setting is Closed, which means only members see what’s going on in a group.”

There are three kinds of groups: open, closed and secret. Open groups have public membership listings and public content. Closed groups have public membership but private content. For secret groups, both the membership and the content are private.

A key part of the idea is that the group members collectively define who is in the group, spreading the work of setting up and maintaining the group over many people.

But a serious issue with the new Facebook group framework is that a member can unilaterally add any of their friends to a group. No confirmation from the person being added is required. This was raised as an issue by Jason Calacanis.

The constraint that one can only add a Facebook friend to a group one already belongs to does offer some protection against ending up in unwanted groups (e.g., ones created by spammers). But it could still lead to problems. I could, for example, create a closed group named Crazy people who smell bad and add all of my friends without their consent. Since a closed group, unlike a secret one, has a public membership list, anyone could see who is in it. Worse yet, I could then leave the group. (By the way, let me know if you want to join any of these groups.)

While this might just be an annoying prank, it could spin out of control — what might happen if one of your so-called friends adds you to a new, closed “Al-Qaeda lovers” group?

The good news is that this should be easy to fix. After all, Facebook does require confirmation for the friend relation and has a mechanism for recommending that friends like pages or try apps. Either mechanism would work for inviting others to join groups.

We have started working with a new group-centric secure information sharing model being developed by Ravi Sandhu and others as a foundation for better access and privacy controls in social media systems. It seems like a great match.

See update.

Taintdroid catches Android apps that leak private user data

Thursday, September 30th, 2010

Ars Technica has an article on bad Android apps, Some Android apps caught covertly sending GPS data to advertisers.

“The results of a study conducted by researchers from Duke University, Penn State University, and Intel Labs have revealed that a significant number of popular Android applications transmit private user data to advertising networks without explicitly asking or informing the user. The researchers developed a piece of software called TaintDroid that uses dynamic taint analysis to detect and report when applications are sending potentially sensitive information to remote servers.

They used TaintDroid to test 30 popular free Android applications selected at random from the Android market and found that half were sending private information to advertising servers, including the user’s location and phone number. In some cases, they found that applications were relaying GPS coordinates to remote advertising network servers as frequently as every 30 seconds, even when not displaying advertisements. These findings raise concern about the extent to which mobile platforms can insulate users from unwanted invasions of privacy.”

TaintDroid is an experimental system that “analyses how private information is obtained and released by applications ‘downloaded’ to consumer phones”. A paper on the system will be presented at the 2010 USENIX Symposium on Operating Systems Design and Implementation later this month.

TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones, William Enck, Peter Gilbert, Byung-gon Chun, Landon P. Cox, Jaeyeon Jung, Patrick McDaniel, and Anmol N. Sheth, OSDI, October 2010.

The project, Realtime Privacy Monitoring on Smartphones, has a good overview site with a FAQ and demo.
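To give a feel for how dynamic taint analysis works, here is a toy sketch in Python (mine, and far simpler than the real system, which instruments Android’s Dalvik VM to propagate tags automatically): sensitive values are tagged at their source and checked when they reach a network sink.

```python
# Toy sketch of taint tracking, not TaintDroid itself. Sensitive values carry
# a tag naming their source; a network "sink" reports any tagged data it sees.

class Tainted(str):
    """A string value carrying a taint label that names its sensitive source."""
    def __new__(cls, value, source):
        obj = super().__new__(cls, value)
        obj.source = source
        return obj

def get_gps_location():
    return Tainted("39.25,-76.71", source="GPS")        # taint source

def build_ad_request(location):
    msg = "GET /ad?loc=" + location
    # Re-attach the tag after the string operation; TaintDroid does this kind
    # of propagation automatically inside the VM.
    return Tainted(msg, location.source) if isinstance(location, Tainted) else msg

def send_to_ad_server(payload):                          # taint sink
    if isinstance(payload, Tainted):
        print(f"ALERT: data derived from {payload.source} sent to a remote server")
    else:
        print("clean payload sent")

send_to_ad_server(build_ad_request(get_gps_location()))  # ALERT: ... GPS ...
```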

This is just one example of a rich and complex area full of trade-offs. We want our systems and devices to be smarter and to really understand us — our preferences, context, activities, interests, intentions, and pretty much everything short of our hopes and dreams. We then want them to use this knowledge to better serve us — selecting music, turning the ringer on and off, alerting us to relevant news, etc. Developing this technology is neither easy nor cheap, and the developers have to profit from creating it. Extracting personal information that can be used or sold is one model — just as Google and others do to provide better ad placement on the Web.

Here’s a quote from the Ars Technica article that resonated with me.

“As Google says in its list of best practices that developers should adopt for data collection, providing users with easy access to a clear and unambiguous privacy policy is really important.”

We, and many others, are trying to prepare for the next step — when users can define their own privacy policies and these will be understood and enforced by their devices.
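In rough form, that future might look something like the following sketch (mine, not a deployed system): the user writes simple rules about which kinds of data may flow to which destinations, and the device checks each outbound transmission against them, denying anything not explicitly permitted.

```python
# Rough sketch, not a deployed system: a user-authored policy saying which
# categories of data may be sent to which destinations, enforced at send time.

policy = {
    ("location", "maps-service"): "allow",
    ("location", "ad-network"): "deny",
    ("phone-number", "*"): "deny",
}

def allowed(data_type, destination):
    for (dt, dest), decision in policy.items():
        if dt == data_type and dest in (destination, "*"):
            return decision == "allow"
    return False   # default deny for anything the user has not explicitly permitted

print(allowed("location", "maps-service"))  # True
print(allowed("location", "ad-network"))    # False
print(allowed("contacts", "ad-network"))    # False (default deny)
```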

Is Twitter’s plan to log all clicks a privacy loss?

Thursday, September 2nd, 2010

Twitter’s planned shortening of all links via its t.co service is about to happen. The initial motivation was security, according to Twitter:

“Twitter’s link service at http://t.co is used to better protect users from malicious sites that engage in spreading malware, phishing attacks, and other harmful activity. A link converted by Twitter’s link service is checked against a list of potentially dangerous sites. When there’s a match, users can be warned before they continue.”

Declan McCullagh reports that Twitter announced in an email message that when someone clicks “on these links from Twitter.com or a Twitter application, Twitter will log that click.” Such information is extremely valuable. Given Twitter’s tens of millions of active users, just knowing how often certain URLs are clicked indicates which entities and topics are of interest at the moment.

“Our link service will also be used to measure information like how many times a link has been clicked. Eventually, this information will become an important quality signal for our Resonance algorithm—the way we determine if a Tweet is relevant and interesting.”

Associating the clicks with a user, IP address, location or device can yield even more information — like what you are interested in right now. Moreover, Twitter now has a way to associate arbitrary annotation metadata with each tweet. Analyzing all of this data can identify, for example, communities of users with common interests and the influential members within them.

Note that Twitter has not said it will do this or even that it will record and keep any user-identifiable information along with the clicks. They might just log the aggregate number of clicks in a window of time. But going the next step and capturing the additional information would be, in my mind, irresistible, even if there was no immediate plan to use it.
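The gap between those two regimes is easy to see in a sketch (mine, not Twitter’s implementation): aggregate counts per URL per time window are enough to spot trending links, while per-user logs tie interests to a person.

```python
# Sketch of the two logging regimes (mine, not Twitter's implementation).
from collections import Counter, defaultdict
from typing import Optional
import time

WINDOW = 3600  # one-hour buckets

aggregate_counts = defaultdict(Counter)   # window start -> URL -> click count
per_user_clicks = defaultdict(list)       # user -> [(timestamp, URL), ...]

def log_click(url: str, user: Optional[str] = None) -> None:
    now = int(time.time())
    window = now - (now % WINDOW)
    aggregate_counts[window][url] += 1     # enough to spot trending links
    if user is not None:                   # the extra, privacy-sensitive step:
        per_user_clicks[user].append((now, url))   # ties a URL to a person

log_click("http://t.co/abc123", user="alice")   # identifiable logging
log_click("http://t.co/abc123")                 # aggregate-only logging
```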

Search engines like Google already link clicks to users and IP addresses and use the information to improve their ranking algorithms and probably in many other ways. But what is troubling is the seemingly inexorable erosion of our online privacy. There will be no way to opt out of having your link wrapped by the t.co service and no announced way to opt out of having your clicks logged.

An ontology of social media data for better privacy policies

Sunday, August 15th, 2010

Privacy continues to be an important topic surrounding social media systems. A big part of the problem is that virtually all of us have a difficult time thinking about what information about us is exposed, to whom, and for how long. As UMBC colleague Zeynep Tufekci points out, our intuitions in such matters come from experiences in the physical world, a place whose physics differs considerably from that of the cyber world.

Bruce Schneier offered a taxonomy of social networking data in a short article in the July/August issue of IEEE Security & Privacy. A version of the article, A Taxonomy of Social Networking Data, is available on his site.

“Below is my taxonomy of social networking data, which I first presented at the Internet Governance Forum meeting last November, and again — revised — at an OECD workshop on the role of Internet intermediaries in June.

  • Service data is the data you give to a social networking site in order to use it. Such data might include your legal name, your age, and your credit-card number.
  • Disclosed data is what you post on your own pages: blog entries, photographs, messages, comments, and so on.
  • Entrusted data is what you post on other people’s pages. It’s basically the same stuff as disclosed data, but the difference is that you don’t have control over the data once you post it — another user does.
  • Incidental data is what other people post about you: a paragraph about you that someone else writes, a picture of you that someone else takes and posts. Again, it’s basically the same stuff as disclosed data, but the difference is that you don’t have control over it, and you didn’t create it in the first place.
  • Behavioral data is data the site collects about your habits by recording what you do and who you do it with. It might include games you play, topics you write about, news articles you access (and what that says about your political leanings), and so on.
  • Derived data is data about you that is derived from all the other data. For example, if 80 percent of your friends self-identify as gay, you’re likely gay yourself.”

I think most of us understand the first two categories and can easily choose or specify a privacy policy to control access to information in them. The rest however, are more difficult to think about and can lead to a lot of confusion when people are setting up their privacy preferences.

As an example, I saw some nice work at the 2010 IEEE International Symposium on Policies for Distributed Systems and Networks on “Collaborative Privacy Policy Authoring in a Social Networking Context” by Ryan Wishart et al. from Imperial College that addressed the problem of incidental data in Facebook. For example, if I post a picture and tag others in it, each of the tagged people can contribute additional policy constraints that can narrow access to it.
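The core idea is easy to sketch (my own toy version, not the formalism in the paper): the photo’s owner proposes an audience, each tagged person contributes a constraint, and the effective audience is the intersection, so constraints can only narrow access.

```python
# Toy version of collaborative policy authoring (mine, not the Wishart et al.
# formalism): each tagged person adds a constraint, and the photo's effective
# audience is the intersection of the owner's audience with all of them.

owner_audience = {"alice", "bob", "carol", "dave"}

tagged_constraints = {
    "bob":   {"alice", "bob", "carol"},          # bob excludes dave
    "carol": {"alice", "bob", "carol", "dave"},  # carol allows everyone
}

effective_audience = set(owner_audience)
for allowed in tagged_constraints.values():
    effective_audience &= allowed                # constraints only narrow access

print(effective_audience)   # {'alice', 'bob', 'carol'}
```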

Lorrie Cranor gave an invited talk at the workshop on Building a Better Privacy Policy and made the point that even P3P privacy policies are difficult for people to comprehend.

Having a simple ontology for social media data could help us move forward toward better privacy controls for online social media systems. I like Schneier’s broad categories and wonder what a more complete treatment defined using Semantic Web languages might be like.
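As a starting point, here is one way Schneier’s six categories might be written down as a small RDFS vocabulary (my sketch, with made-up URIs; as far as I know no such standard ontology exists yet).

```python
# A first cut at Schneier's taxonomy as a small RDFS vocabulary (my sketch,
# using illustrative URIs; no such standard ontology exists).
from rdflib import Graph, Literal, Namespace, RDF, RDFS

SMD = Namespace("http://example.org/social-media-data#")
g = Graph()
g.bind("smd", SMD)

g.add((SMD.SocialMediaData, RDF.type, RDFS.Class))

categories = {
    "ServiceData":    "Data given to the site in order to use it",
    "DisclosedData":  "What you post on your own pages",
    "EntrustedData":  "What you post on other people's pages",
    "IncidentalData": "What other people post about you",
    "BehavioralData": "What the site records about your activity",
    "DerivedData":    "Data inferred from all the other data",
}
for name, gloss in categories.items():
    cls = SMD[name]
    g.add((cls, RDF.type, RDFS.Class))
    g.add((cls, RDFS.subClassOf, SMD.SocialMediaData))
    g.add((cls, RDFS.comment, Literal(gloss)))

print(g.serialize(format="turtle"))
```

A fuller treatment would add properties such as who created the data, who controls it, and who it is about, which is where the interesting policy distinctions live.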

Privacy vulnerability in Apple Safari

Thursday, July 22nd, 2010

Apple’s Safari browser has a privacy vulnerability allowing web sites you visit to extract your personal information (e.g., name, address, phone number) from your computer’s address book. The fix is to turn off Safari’s web form AutoFill feature, which is enabled by default (Preferences > AutoFill > AutoFill web form).


[screenshot: Safari AutoFill preferences]

It’s an interesting JavaScript exploit that does not seem to affect other browsers.

Google Dashboard shows what data Google has about you

Thursday, November 5th, 2009

Google added a great new service, Dashboard, that summarizes data stored for a Google account — see MY ACCOUNT>PERSONAL SETTINGS>DASHBOARD.

“Designed to be simple and useful, the Dashboard summarizes data for each product that you use (when signed in to your account) and provides you direct links to control your personal settings. Today, the Dashboard covers more than 20 products and services, including Gmail, Calendar, Docs, Web History, Orkut, YouTube, Picasa, Talk, Reader, Alerts, Latitude and many more. The scale and level of detail of the Dashboard is unprecedented, and we’re delighted to be the first Internet company to offer this — and we hope it will become the standard.”

This is a good move on Google’s part. But while there’s a lot of information included, it’s not everything that Google knows about you — e.g., data in cookies, click-through data from search results, and information from companies it has acquired, like DoubleClick.

Still, it is a big step in a positive direction toward privacy awareness.