Archive for the ‘semantic web’ Category

AAAI Symposium on Open Government Knowledge, 4-6 Nov 2011, Arlington VA

Wednesday, November 2nd, 2011

If you are in the DC area this weekend and are interested in using Semantic Web technologies, you should come to the AAAI 2011 Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges. It runs from Friday to Sunday midday at the Westin Arlington Gateway in Arlington, Virginia.

Join us to meet governmental and business thought leaders in US open government data activities and to discuss the challenges. The symposium features Friday (Nov 4) as governmental day, with speakers on open gov data activities in NIH/NCI and NASA, and Saturday (Nov 5) as R&D day, with speakers from industry, including Google and Microsoft, as well as international researchers.

This symposium will explore how AI technologies such as the Semantic Web, information extraction, statistical analysis and machine learning, can be used to make the valuable knowledge embedded in open government data more explicit, accessible and reusable.

See the OGK website for complete details.

Tim Berners-Lee on protecting the Web in the December Scientific American

Friday, November 19th, 2010

Sir Tim Berners-Lee discusses the principles underlying the Web and the need to protect them in an article from the December issue of Scientific American, Long Live the Web.

“The Web evolved into a powerful, ubiquitous tool because it was built on egalitarian principles and because thousands of individuals, universities and companies have worked, both independently and together as part of the World Wide Web Consortium, to expand its capabilities based on those principles.

The Web as we know it, however, is being threatened in different ways. Some of its most successful inhabitants have begun to chip away at its principles. Large social-networking sites are walling off information posted by their users from the rest of the Web. Wireless Internet providers are being tempted to slow traffic to sites with which they have not made deals. Governments—totalitarian and democratic alike—are monitoring people’s online habits, endangering important human rights.

If we, the Web’s users, allow these and other trends to proceed unchecked, the Web could be broken into fragmented islands. We could lose the freedom to connect with whichever Web sites we want. The ill effects could extend to smartphones and pads, which are also portals to the extensive information that the Web provides.

Why should you care? Because the Web is yours. It is a public resource on which you, your business, your community and your government depend. The Web is also vital to democracy, a communications channel that makes possible a continuous worldwide conversation. The Web is now more critical to free speech than any other medium. It brings principles established in the U.S. Constitution, the British Magna Carta and other important documents into the network age: freedom from being snooped on, filtered, censored and disconnected.”

Near the end of the long feature article, he mentions the Semantic Web’s linked data as one of the major new technologies the Web will give birth to, provided the principles are upheld.

“A great example of future promise, which leverages the strengths of all the principles, is linked data. Today’s Web is quite effective at helping people publish and discover documents, but our computer programs cannot read or manipulate the actual data within those documents. As this problem is solved, the Web will become much more useful, because data about nearly every aspect of our lives are being created at an astonishing rate. Locked within all these data is knowledge about how to cure diseases, foster business value and govern our world more effectively.”

One of the benefits of linked data is that it makes data integration and fusion much easier. The benefit comes with a potential risk, which Berners-Lee acknowledges.

“Linked data raise certain issues that we will have to confront. For example, new data-integration capabilities could pose privacy challenges that are hardly addressed by today’s privacy laws. We should examine legal, cultural and technical options that will preserve privacy without stifling beneficial data-sharing capabilities.”

The risk is not unique to linked data, and new research is underway, in our lab and elsewhere, on how to also use Semantic Web technology to protect privacy.

Facebook Browser gets a low F1-score in my book

Sunday, September 12th, 2010

Facebook has rolled out Facebook Browser as what sounds like a simple and effective idea: recommend pages based on a user’s country and social network. My impression is mixed, however. While I like its top recommendation for me, I am already a fan. Its suggestions for the celebrities category are a bust: Rush Limbaugh, Glenn Beck, Michelle Malkin, Mark Levin, Red Green and Bill O’Reilly. And Movies? Don’t even go there! Maybe it’s trying to tell me I need a new set of friends? Inside Facebook summarizes Facebook Browser this way:

“Facebook has launched a new way to “Discover Facebook’s Popular Pages” called Browser. It shows icons of Pages that are popular in a user’s country, but factors in which Pages which are popular amongst their unique friend network. When the Page icons are hovered over they display a Like button. Browser could cause popular Pages to get more popular, widening the gap between them and smaller Pages, similar to the frequently criticized and since abandoned Twitter Suggested User List.”

I think the idea is sound, though, and I like my Facebook friends. So, my conclusion is that Facebook needs to tweak the algorithm.
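
For readers who have not met the score in the title: F1 is the harmonic mean of precision (how many of the recommended pages are ones I want) and recall (how many of the pages I want were recommended). Here is a minimal sketch in Python. The page names and the choice of which pages count as relevant are invented for illustration and say nothing about how Facebook evaluates Browser.

```python
def f1_score(recommended, relevant):
    """Return (precision, recall, F1) for a set of recommendations."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    if hits == 0:
        # No overlap at all: every score is zero by convention.
        return 0.0, 0.0, 0.0
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: four recommended pages, two pages the user actually wants.
recommended = {"page_a", "page_b", "page_c", "page_d"}
relevant = {"page_d", "page_e"}

precision, recall, f1 = f1_score(recommended, relevant)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

With one useful page out of four recommended and one of two wanted pages found, precision is 0.25, recall is 0.50 and F1 is about 0.33.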

An ontology of social media data for better privacy policies

Sunday, August 15th, 2010

Privacy continues to be an important topic surrounding social media systems. A big part of the problem is that virtually all of us have a difficult time thinking about what information about us is exposed and to whom and for how long. As UMBC colleague Zeynep Tufekci points out, our intuitions in such matters come from experiences in the physical world, a place whose physics differs considerably from the cyber world.

Bruce Schneier offered a taxonomy of social networking data in a short article in the July/August issue of IEEE Security & Privacy. A version of the article, A Taxonomy of Social Networking Data, is available on his site.

“Below is my taxonomy of social networking data, which I first presented at the Internet Governance Forum meeting last November, and again — revised — at an OECD workshop on the role of Internet intermediaries in June.

  • Service data is the data you give to a social networking site in order to use it. Such data might include your legal name, your age, and your credit-card number.
  • Disclosed data is what you post on your own pages: blog entries, photographs, messages, comments, and so on.
  • Entrusted data is what you post on other people’s pages. It’s basically the same stuff as disclosed data, but the difference is that you don’t have control over the data once you post it — another user does.
  • Incidental data is what other people post about you: a paragraph about you that someone else writes, a picture of you that someone else takes and posts. Again, it’s basically the same stuff as disclosed data, but the difference is that you don’t have control over it, and you didn’t create it in the first place.
  • Behavioral data is data the site collects about your habits by recording what you do and who you do it with. It might include games you play, topics you write about, news articles you access (and what that says about your political leanings), and so on.
  • Derived data is data about you that is derived from all the other data. For example, if 80 percent of your friends self-identify as gay, you’re likely gay yourself.”

I think most of us understand the first two categories and can easily choose or specify a privacy policy to control access to information in them. The rest, however, are more difficult to think about and can lead to a lot of confusion when people are setting up their privacy preferences.

As an example, I saw some nice work at the 2010 IEEE International Symposium on Policies for Distributed Systems and Networks on “Collaborative Privacy Policy Authoring in a Social Networking Context” by Ryan Wishart et al. from Imperial College that addressed the problem of incidental data in Facebook. For example, if I post a picture and tag others in it, each of the tagged people can contribute additional policy constraints that can narrow access to it.
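
The narrowing idea can be sketched in a few lines of Python. This is only an illustration of "each tagged person can remove viewers but not add them", modeled as set intersection. It is not the mechanism from the Wishart et al. paper, and the function name and user names are hypothetical.

```python
def effective_audience(owner_audience, tagged_constraints):
    """Intersect the owner's audience with every tagged person's allowed set.

    Each constraint can only remove viewers, never add them.
    """
    audience = set(owner_audience)
    for allowed in tagged_constraints.values():
        audience &= set(allowed)
    return audience


# Hypothetical example: the poster shares a photo with four friends.
owner_audience = {"alice", "bob", "carol", "dave"}

# Each tagged person lists who they are willing to let see the photo.
tagged_constraints = {
    "erin": {"alice", "bob", "carol"},   # erin excludes dave
    "frank": {"bob", "carol", "dave"},   # frank excludes alice
}

viewers = effective_audience(owner_audience, tagged_constraints)
print(sorted(viewers))
```

Only bob and carol remain, since they are the only viewers acceptable to the poster and to both tagged people.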

Lorrie Cranor gave an invited talk at the workshop on Building a Better Privacy Policy and made the point that even P3P privacy policies are difficult for people to comprehend.

Having a simple ontology for social media data could help us move forward toward better privacy controls for online social media systems. I like Schneier’s broad categories and wonder what a more complete treatment defined using Semantic Web languages might be like.
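
As a very rough starting point, the six categories can be written down as data. The sketch below uses plain Python rather than a Semantic Web language. The attribute names (`created_by`, `subject_controls`) are my own assumptions, and a value of `None` means Schneier's description does not say.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Party(Enum):
    SUBJECT = "the person the data is about"
    OTHER_USER = "another user"
    SITE = "the social networking site"


@dataclass(frozen=True)
class DataCategory:
    name: str
    created_by: Optional[Party]        # None: not stated in the taxonomy
    subject_controls: Optional[bool]   # None: not stated in the taxonomy


# Values follow the quoted descriptions; anything unstated is left as None.
TAXONOMY = [
    DataCategory("service", Party.SUBJECT, None),
    DataCategory("disclosed", Party.SUBJECT, True),
    DataCategory("entrusted", Party.SUBJECT, False),
    DataCategory("incidental", Party.OTHER_USER, False),
    DataCategory("behavioral", Party.SITE, None),
    DataCategory("derived", None, None),
]

# Categories where the taxonomy says outright that the subject lacks control.
no_control = [c.name for c in TAXONOMY if c.subject_controls is False]
print(no_control)
```

Even this crude encoding makes one point visible: control is only stated for three of the six categories, which is part of why the others are hard to write policies for.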

Semantic Web seen as a disruptive technology

Friday, August 13th, 2010

Washington Technology, which describes itself as “the online authority for government contractors and partners”, has an article by Carlos A. Soto on 5 technologies that will change the market. They are:

  1. Mobile
  2. Search and the Semantic Web
  3. Search and the Semantic Web
  4. Virtualization and cloud computing
  5. Virtualization and cloud computing

These are reasonable choices, though I’d not have done the double counting and would have added “machine learning applied to the massive amounts of Web data now available” and “social computing”.

But it’s gratifying to see the Semantic Web in the list. Here’s some of what he has to say about search and the Semantic Web.

The relationship between search technology and the Semantic Web is a perfect illustration of how a small sustaining technology, such as a basic search feature on an operating system, will eventually be eaten up by a larger disruptive technology, such as the Semantic Web. The Semantic Web has the potential of acting like a red giant star by expanding at exponential rates, swallowing whole planets of existing technology in the process.

The technology started as a simple group of secure, trusted, linked data stores. Now Semantic Web technologies enable people to create data stores on the Web and then build vocabularies or write rules for handling the data. Because all the data by definition is trusted, security is often less of a problem.

The task of turning the World Wide Web into a giant dynamic database is causing a shift among traditional search engines because products such as Apture, by Apture Inc. of San Francisco, Calif., let content publishers include pop-up definitions, images or data whenever a user scrolls over a word on a Web site. The ability to categorize content in this manner could have significant implications not only for Web searches but also for corporate intranets and your desktop PC.

These types of products will continue to expand, initially in the publishing industry and then to most industries on the Web in the next two to three years.

For example, human resources sites could use them to pop up a picture and a résumé blip when a recruiter drags a mouse over an applicant’s name. Medical and financial sites such as the National Institutes of Health could use it to break down jargon and help with site exploration.

Government sites around the world, such as Zaragoza, Spain, and medical facilities, such as the Cleveland Medical Clinic, are using the vocabulary features of the Semantic Web to create search engines that reach across complex jargon and tech silos to offer a high degree of automation, full integration with external systems and various terminologies, in addition to the ability to accurately answer users’ queries.

(h/t @FrankVanHarmele)

Tools for secure cloud computing

Friday, August 6th, 2010

University of Texas at Dallas AISL researchers have released software tools designed to facilitate cloud computing. “In order to use electricity, we do not maintain electricity generators at home, instead we get the electricity on demand from the grid when we need it,” says UTD Cyber Security Research Center director and AISL project co-PI Bhavani Thuraisingham. Read the full story here.

The first release of the UT Dallas team’s cloud-computing resources features a repository consisting of a collection of tools that provide secure query processing capabilities, preventing unauthorized access to sensitive data. Tools are also being developed to add security to data storage services by storing sensitive data in encrypted format.

JWS special issue on Provenance and Semantic Web

Monday, July 19th, 2010

Journal of Web Semantics Special Issue on
Using Provenance in the Semantic Web

Editors: Yolanda Gil, University of Southern California’s Information Sciences Institute and Paul Groth, Free University of Amsterdam

The Web is a decentralized system full of information provided by diverse open sources of varying quality. For any given question there will be a multitude of answers offered, raising the need for assessing their relative value and for making decisions about what sources to trust. In order to make effective use of the Web, we routinely evaluate the information we get, the sources that provided it, and the processes that produced it. A trust layer was always present in the Web architecture, and Berners-Lee envisioned an “oh-yeah?” button in the browser to check the sources of an assertion. The Semantic Web raises these questions in the context of automated applications (e.g. reasoners, aggregators, or agents), whether trying to answer questions using the Linked Data cloud, use a mashup appropriately or determine trust on a social network. Therefore, provenance is an important aspect of the Web that becomes crucial in Semantic Web research.

This special issue on Using Provenance in the Semantic Web of the Journal of Web Semantics aims to collect representative research in handling provenance while using and reasoning about information and resources on the web. Provenance has been addressed in a variety of areas in computer science targeting specific contexts, such as databases and scientific workflows. Provenance is important in a variety of contexts, including open science, open government, and intellectual property and copyright. Provenance requirements must be understood for specific kinds of Web resources, such as documents, services, ontologies, workflows, and datasets.

We seek high quality submissions that describe recent projects, articulate research challenges, or put forward synergistic perspectives on provenance. We solicit submissions that advance the Semantic Web through exploiting provenance, addressing research issues including:

  • representing provenance
  • relating provenance to the underlying data and information
  • managing provenance in a distributed web
  • reasoning about trust based on provenance
  • handling incomplete provenance
  • taking advantage of the web’s structure for provenance

Submissions may focus on uses of provenance in the Semantic Web for:

  • linked data
  • social networking
  • data integration
  • inference from diverse sources
  • trust and proof

Papers may also focus on application areas, highlighting the challenges and benefits of using provenance:

  • provenance in open science
  • provenance in open government
  • provenance in copyright and intellectual property for documents
  • provenance in web publishing

Important Dates

We will aim at an efficient publication cycle in order to guarantee prompt availability of the published results. We will review papers on a rolling basis as they are submitted and explicitly encourage submissions well before the submission deadline. Submit papers online at the journal’s Elsevier Web site.

  • Submission deadline: 5 September 2010
  • Author notification: 15 December 2010
  • Revisions submitted: 1 February 2011
  • Final decisions: 15 March 2011
  • Publication: 1 April 2011

Submission guidelines

The Journal of Web Semantics solicits original scientific contributions of high quality. Following the overall mission of the journal, we emphasize the publication of papers that combine theories, methods and experiments from different subject areas in order to deliver innovative semantic methods and applications. The publication of large-scale experiments and their analysis is also encouraged to clearly illustrate scenarios and methods that introduce semantics into existing Web interfaces, contents and services. Submission of your manuscript is welcome provided that it, or any translation of it, has not been copyrighted or published and is not being submitted for publication elsewhere. Upon acceptance of an article, the author(s) will be asked to transfer copyright of the article to the publisher. This transfer will ensure the widest possible dissemination of information. Manuscripts should be prepared for publication in accordance with instructions given in the “Guide for Authors” (available from the publisher); details can be found online. The submission and review process will be carried out using Elsevier’s Web-based EES system. Final decisions on accepted papers will be approved by an editor in chief.

About the Journal of Web Semantics

The Journal of Web Semantics has been published by Elsevier since 2003. It is an interdisciplinary journal based on research and applications of various subject areas that contribute to the development of a knowledge-intensive and intelligent service Web. These areas include knowledge technologies, ontology, agents, databases and the semantic grid; disciplines like information retrieval, language technology, human-computer interaction and knowledge discovery are obviously of major relevance as well. All aspects of the Semantic Web development are covered. The current Editors-in-Chief are Tim Finin, Riichiro Mizoguchi and Steffen Staab. For information on all editors, see our site.

The Journal of Web Semantics offers to its authors and readers:

  • Professional support with publishing by Elsevier staff
  • Indexed by Thomson-Reuters web of science
  • Impact factor 3.41: the third highest out of 92 titles in Thomson-Reuters’ category “Computer Science, Information Systems”

Creating more secure cloud computing environments

Saturday, July 10th, 2010

The Air Force recently highlighted some of our AISL MURI research done at the University of Texas at Dallas on developing solutions for maintaining privacy in cloud computing environments.

The work is part of a three-year project funded by the Air Force Office of Scientific Research aimed at understanding the fundamentals of information sharing and developing new approaches to making it easier to do so securely.

Dr. Bhavani Thuraisingham has put together a team of researchers from the UTD School of Management and its School of Economics, Policy and Political Sciences to investigate information sharing with consideration to confidentiality and privacy in cloud computing.

“We truly need an interdisciplinary approach for this,” she said. “For example, proper economic incentives need to be combined with secure tools to enable assured information sharing.”

Thuraisingham noted that cloud computing is increasingly being used to process large amounts of information. Because of this increase, some of the current technologies are being modified to be useful for that environment as well as to ensure security of a system.

To achieve their goals, the researchers are inserting new security programming directly into software programs to monitor and prevent intrusions. They have provided additional security by encrypting sensitive data that is not retrievable in its original form without accessing encryption keys. They are also using Chinese Wall, which is a set of policies that give access to information based on previously viewed data.
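
To make the Chinese Wall idea concrete, here is a minimal Python sketch of a history-based read rule in the style of the Brewer-Nash model: once a user has viewed one dataset in a conflict-of-interest class, the other datasets in that class are closed to them. This is not the UTD team's implementation. The class name, method names and datasets are hypothetical, and the rule is simplified to reads only.

```python
class ChineseWall:
    """Grant or deny access based on what each user has already viewed."""

    def __init__(self, conflict_classes):
        # conflict_classes maps a conflict class name to its competing datasets.
        self._class_of = {
            dataset: cls
            for cls, members in conflict_classes.items()
            for dataset in members
        }
        self._history = {}  # user -> {conflict class: dataset already viewed}

    def request(self, user, dataset):
        cls = self._class_of[dataset]
        viewed = self._history.setdefault(user, {})
        # Deny if the user already viewed a different dataset in this class.
        if viewed.get(cls, dataset) != dataset:
            return False
        viewed[cls] = dataset
        return True


# Hypothetical datasets grouped into two conflict-of-interest classes.
wall = ChineseWall({
    "banks": {"bank_a", "bank_b"},
    "oil": {"oil_x", "oil_y"},
})

results = [
    wall.request("analyst", "bank_a"),  # first access in "banks": allowed
    wall.request("analyst", "oil_x"),   # different class: allowed
    wall.request("analyst", "bank_b"),  # competitor of bank_a: denied
    wall.request("analyst", "bank_a"),  # same dataset again: allowed
]
print(results)
```

The decision depends on the requester's access history rather than on a fixed label, which is the sense in which access is based on previously viewed data.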

The scientists are using prototype systems that can store semantic web data in an encrypted form and query it securely using a web service that provides reliable capacity in the cloud. They have also introduced secure software and hardware attached to a database system that performs security functions.

Assured information sharing in cloud computing is daunting, but Thuraisingham and her team are creating both a framework and incentives that will be beneficial to the Air Force, other branches of the military and the private sector.

The next step for Thuraisingham and her fellow researchers is examining how their framework operates in practice.

“We plan to run some experiments using online social network applications to see how various security and incentive measures affect information sharing,” she said.

Thuraisingham is especially glad that AFOSR had the vision to fund such an initiative that is now becoming international in its scope.

“We are now organizing a collaborative, international dimension to this project by involving researchers from Kings College, University of London, University of Insubria in Italy and UTD related to secure query processing strategies,” said AFOSR program manager, Dr. Robert Herklotz.