Resource Identification in RDF

This is one of those “Oh, sh*t, that was totally unnecessary and/or pointless in hindsight” articles. This particular one is pointless because I had a complete misunderstanding. The first section describes the misunderstanding I had, and the second section is what I wrote initially before I realized my misunderstanding, so the discussion there is naturally off.

Resource Identification in RDF

The way Tim Berners-Lee described his Linked Data concept gave me the impression that resources need to be accessible via HTTP, but I was wrong.

The section “Resource Identification” in the “Resource Description Framework (RDF)” Wikipedia page says this:

In fact, the URI that names a resource does not have to be dereferenceable at all. For example, a URI that begins with “http:” and is used as the subject of an RDF statement does not necessarily have to represent a resource that is accessible via HTTP, nor does it need to represent a tangible, network-accessible resource — such a URI could represent absolutely anything. However, there is broad agreement that a bare URI (without a # symbol) which returns a 300-level coded response when used in an HTTP GET request should be treated as denoting the internet resource that it succeeds in accessing.
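The point of the quoted passage can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the triples are plain tuples, and every example.org URI and vocabulary term is hypothetical. Nothing is ever fetched; the URIs act purely as names.

```python
# Triples are plain (subject, predicate, object) tuples; URIs are just strings.
# All example.org URIs below are hypothetical and are never dereferenced.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
KNOWS = "http://example.org/vocab/knows"
NAME = "http://example.org/vocab/name"

triples = {
    (ALICE, NAME, "Alice"),
    (ALICE, KNOWS, BOB),
    (BOB, NAME, "Bob"),
}


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# The statements can be queried even though no server answers at example.org/people.
print(objects(ALICE, KNOWS))
```

The data is usable whether or not anything exists at those addresses, which is all the Wikipedia passage claims.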

My Problem with Tim Berners-Lee’s Ideas of Linked Data

Note: This section contains falsehoods, which resulted from my misunderstanding. Please see the previous section.

According to “Linked data,” Tim Berners-Lee described his Linked Data (LD) concept as follows (2009 version):

  1. All kinds of conceptual things, they have names now that start with HTTP.
  2. If I take one of these HTTP names and I look it up…I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.
  3. When I get back that information it’s not just got somebody’s height and weight and when they were born, it’s got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it’s related to is given one of those names that starts with HTTP.
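Read literally, the three points describe a lookup-and-follow loop. Here is a rough sketch of that loop, assuming an in-memory dictionary (FAKE_WEB, a made-up stand-in) in place of real HTTP requests; the names, the "knows" relationship, and the data are all hypothetical.

```python
# Hypothetical in-memory stand-in for "looking up" an HTTP name; no network access.
FAKE_WEB = {
    "http://example.org/people/alice": {
        "name": "Alice",
        "knows": ["http://example.org/people/bob"],
    },
    "http://example.org/people/bob": {
        "name": "Bob",
        "knows": [],
    },
}


def look_up(name):
    """Point 2: looking up a name returns data about the thing it names."""
    return FAKE_WEB.get(name, {})


def reachable_names(start):
    """Point 3: follow relationships, each of which is itself an HTTP name."""
    seen, queue = [], [start]
    while queue:
        name = queue.pop(0)
        if name in seen:
            continue
        seen.append(name)
        queue.extend(look_up(name).get("knows", []))
    return seen


print(reachable_names("http://example.org/people/alice"))
```

Swapping look_up for a function that issues a real HTTP GET would give the behaviour the talk describes; the traversal logic itself does not care how the lookup is done.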

Such vague phrases as “names that start with HTTP” and “HTTP names” leave room for interpretation, but since he used the phrase “HTTP URIs” in his 2006 memo on the same topic, it should be safe to assume he meant accessing resources on the Internet by way of the HTTP protocol.

I do not think the use of the HTTP protocol should be an essential part of the LD concept. The transport can be anything. What I do not like about HTTP URIs is that by definition, hostnames need to exist, which then means domain names need to exist as well, having been registered with a domain name registrar under the supervision of ICANN. I do see the convenience of using a hostname as a namespace separator because it is guaranteed to be unique, but I do not think it needs to be the only way. Note that URIs in general, as opposed to HTTP URIs, do not necessarily have hostnames or domain names as part of them.
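The difference between HTTP URIs and URIs in general is easy to check with Python's standard urllib.parse module. The URIs below are illustrative values I picked for the sketch, not identifiers taken from any real dataset.

```python
from urllib.parse import urlsplit


def hostname_of(uri):
    """Return the hostname component of a URI, or None when it has none."""
    return urlsplit(uri).hostname


examples = [
    "http://example.org/people/alice",                 # HTTP URI: has a hostname
    "urn:isbn:0451450523",                             # URN: no authority part at all
    "urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66",   # URN based on a UUID
]

for uri in examples:
    print(uri, "->", hostname_of(uri))
```

Only the first URI yields a hostname; the two URNs are valid URIs that identify something without involving a domain name.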

Even when you remove the HTTP protocol mandate from TimBL’s LD concept, it is still a valid and useful concept as long as you provide an alternative method or methods of uniquely identifying data and accessing them. In other words, the LD concept can be transport agnostic. When it is, it is no longer directly tied to the World Wide Web as we know it — but it still constitutes a web of information.

A benefit of detaching data’s identity from hostnames is that data can now be hosted on a P2P network.
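One possible way to identify data without a hostname is to derive the identifier from the data itself. The following is a minimal sketch under my own assumptions, not a description of any existing P2P system: the "sha256:" prefix is a made-up naming convention, and Peer is a toy in-memory stand-in for whatever node or transport actually holds the bytes.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the bytes themselves (hypothetical 'sha256:' naming)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class Peer:
    """Toy stand-in for any node that can hold data; not a real P2P client."""

    def __init__(self):
        self._store = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._store[cid] = data
        return cid

    def get(self, cid: str):
        return self._store.get(cid)


def fetch(cid, peers):
    """Ask each peer in turn; accept the first copy whose hash matches the identifier."""
    for peer in peers:
        data = peer.get(cid)
        if data is not None and content_id(data) == cid:
            return data
    return None


peer_a, peer_b = Peer(), Peer()
cid = peer_b.put(b"Alice knows Bob")

# The caller does not need to know which peer holds the data.
print(fetch(cid, [peer_a, peer_b]))
```

Because the identifier can be verified against the content, it does not matter which peer, or which transport, delivered the bytes. The trade-off is that such an identifier names one fixed sequence of bytes, so data that changes over time would need some additional naming layer.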

I wrote “Some Personal Notes on the Semantic Web” as objectively as possible, but this article was originally meant as a place to express my own ideas… and look what happened.
