Games With a Purpose
Luis von Ahn and his team have launched gwap.com, a collection of fun games which also capture machine readable knowledge.
Computable Common Sense
Mike Bergman began an interesting series about the history and motivations of the UMBEL project over the weekend. Stay tuned.
The Cyc Foundation has a new Facebook group. If you are a member, please show your support and invite a friend!
Also on the topic of ontology alignment, TreeJuxtaposer is a wonderful comparison tool by Tamara Munzner.
You Say - We Say is an interesting visualization of folksonomy/ontology alignment.
How do you efficiently import a bunch of terms and assertions into an OpenCyc instance? Recently I looked at a few approaches and found that Cyc’s “KEText” file format works well. The format is described in the KEText docs. To load a file in KEText format, the following SubL command comes in handy:
(load-ke-text-file #$CycAdministrator "C:/my.ketext.txt" :agenda t)
To load very large amounts of data you’ll need to break your KEText files up into fairly small chunks, as there are limits to how much data the SubL command can process.
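As a rough illustration of the chunking step, here is a minimal Python sketch. It assumes that entries in your KEText file are separated by blank lines and that each entry is self-contained (for example, that no entry depends on a microtheory directive set earlier in the file); check both assumptions against your own data. The function names and the default of 500 entries per chunk are made up for this example, since the actual limit is not something I have measured. The load command it prints simply reuses the form shown above.

```python
from pathlib import Path


def split_entries(text):
    """Split KEText into entries, ASSUMING blank lines separate entries."""
    entries, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            entries.append("\n".join(current))
            current = []
    if current:
        entries.append("\n".join(current))
    return entries


def chunk_entries(entries, max_entries):
    """Group entries into lists of at most max_entries items each."""
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    return [entries[i:i + max_entries]
            for i in range(0, len(entries), max_entries)]


def write_chunks(source, out_dir, max_entries=500):
    """Write numbered chunk files and return their paths.

    The default of 500 is an arbitrary illustrative value, not a Cyc limit.
    """
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    entries = split_entries(source.read_text(encoding="utf-8"))
    paths = []
    for n, chunk in enumerate(chunk_entries(entries, max_entries), start=1):
        path = out_dir / f"{source.stem}.part{n:04d}{source.suffix}"
        path.write_text("\n\n".join(chunk) + "\n", encoding="utf-8")
        paths.append(path)
    return paths


def load_command(path):
    """Build the SubL load command from the post for one chunk file."""
    return (f'(load-ke-text-file #$CycAdministrator '
            f'"{Path(path).as_posix()}" :agenda t)')


if __name__ == "__main__":
    # Hypothetical file and directory names, for illustration only.
    for chunk_path in write_chunks("my.ketext.txt", "chunks"):
        print(load_command(chunk_path))
```

Each printed line can then be evaluated in turn, so that no single call has to process the whole data set.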
You could alternatively use the OpenCyc Java API, which provides everything you need to make additions and changes to a Cyc image. In my case I had a bunch of information in XML already, so transforming the XML into KEText format was an easier way to go.
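To give a feel for the XML-to-KEText route, here is a small Python sketch. The XML layout (a `terms` root with `term`, `isa`, `genls` and `comment` elements) is entirely hypothetical and stands in for whatever your own data looks like. The output uses the `Constant:`, `isa:`, `genls:` and `comment:` directives as I understand them from the KEText docs, with each expression ending in a period. The quote escaping in comments is an assumption, so verify it against the docs before loading real data.

```python
import xml.etree.ElementTree as ET

# Hypothetical input format, invented for this example.
SAMPLE_XML = """<terms>
  <term name="Dog">
    <isa>Collection</isa>
    <genls>Animal</genls>
    <comment>A domesticated canine.</comment>
  </term>
</terms>"""


def term_to_ketext(term):
    """Render one <term> element as a KEText entry."""
    lines = [f"Constant: {term.get('name')}."]
    for isa in term.findall("isa"):
        lines.append(f"isa: {isa.text.strip()}.")
    for genls in term.findall("genls"):
        lines.append(f"genls: {genls.text.strip()}.")
    comment = term.findtext("comment")
    if comment:
        # ASSUMPTION: embedded double quotes are escaped with a backslash.
        escaped = comment.strip().replace('"', '\\"')
        lines.append(f'comment: "{escaped}".')
    return "\n".join(lines)


def xml_to_ketext(xml_text):
    """Convert the whole document, separating entries with a blank line."""
    root = ET.fromstring(xml_text)
    entries = [term_to_ketext(t) for t in root.findall("term")]
    return "\n\n".join(entries) + "\n"


if __name__ == "__main__":
    print(xml_to_ketext(SAMPLE_XML), end="")
```

The resulting text can be saved to a file and loaded with the SubL command shown earlier.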
In the history of the Cyc project, Cyc’s knowledge base and inference engine have evolved in a direction far different from most other automated theorem provers. Cyc concentrated on solving problems in very large knowledge spaces (i.e., millions of facts), using higher-order logic, although the problem solutions were often not very deep. The automated theorem proving community, on the other hand, looked at relatively small knowledge spaces (or theorems), but focused on becoming very fast at finding very deep solutions.
To date, there has been fairly little cross-pollination between the two communities. In part, this has been because there was no corpus of problems accessible by both Cyc and automated theorem provers. Now, however, just such a problem corpus has been released and made available in the TPTP (Thousands of Problems for Theorem Provers) format that is the standard for automated theorem proving researchers. More information about this problem suite can be found at
http://www.opencyc.org/doc/tptp_challenge_problem_set.html.
YAGO (“Yet Another Great Ontology”) utilizes the category pages in Wikipedia to link Wikipedia pages to synsets in WordNet. Cyc already has a mapping from WordNet to Cyc concepts, so merging the two, or reusing YAGO’s techniques, would be interesting to explore.
Here is an interesting promotion that you’ll only see in the age of the Semantic Web: Franz, Inc. is selling a triple store called AllegroGraph, and the free version stores up to 50,000,000 triples.
OpenCyc was used as an example data set for load testing.
About twice per year, Cycorp offers a 3-day, hands-on course to introduce folks to semantic modeling, ontology development, automated reasoning, natural language processing, and other aspects of the Cyc technology. The next session of this class will be held from October 24th to 26th in Austin, Texas. For more information, see www.cyc.com/cyc101.