ConceptNet 5

About ConceptNet

ConceptNet is a semantic network containing lots of things computers should know about the world, especially when understanding text written by people.

It is built from nodes representing words or short phrases of natural language, and labeled relationships between them. (We call the nodes "concepts" for tradition, but they'd be better known as "terms".) These are the kinds of relationships computers need to know to search for information better, answer questions, and understand people's goals.

ConceptNet contains everyday basic knowledge:

learn MotivatedByGoal knowledge

You would learn because you want knowledge.

Cultural knowledge:

saxophone UsedFor jazz

A saxophone is used for jazz.

And scientific knowledge:

semantic role HasContext linguistics

"Semantic role" is a term in linguistics.

ConceptNet would not adequately represent human knowledge if it contained only English, so it includes other languages as well:

本 MadeOf 紙

本は紙でできている。 (A book is made of paper.)

Notice how the relations between concepts can be abstract notions such as MadeOf, which we use to mean the same thing across all languages; or they can be language-specific text such as "can cross".
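The examples above all share one shape: a start term, a labeled relation, and an end term. As a minimal sketch of that shape (not ConceptNet's actual storage format), the assertions could be modeled in Python as below. The `Assertion` type, its field names, the language codes, and the `edges_with_relation` helper are all hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class Assertion(NamedTuple):
    """One labeled edge between two terms (hypothetical illustration type)."""
    start: str      # the term the edge starts from
    relation: str   # the relation label, e.g. "UsedFor"
    end: str        # the term the edge points to
    language: str   # language code of the terms (an assumption of this sketch)


# The four example assertions quoted on this page.
ASSERTIONS = [
    Assertion("learn", "MotivatedByGoal", "knowledge", "en"),
    Assertion("saxophone", "UsedFor", "jazz", "en"),
    Assertion("semantic role", "HasContext", "linguistics", "en"),
    Assertion("本", "MadeOf", "紙", "ja"),
]


def edges_with_relation(assertions: List[Assertion], relation: str) -> List[Assertion]:
    """Return every assertion whose relation label matches exactly."""
    return [a for a in assertions if a.relation == relation]


if __name__ == "__main__":
    for edge in edges_with_relation(ASSERTIONS, "MadeOf"):
        print(edge.start, edge.relation, edge.end)
```

Because an abstract relation such as MadeOf means the same thing in every language, a filter on the relation label alone finds matching edges regardless of the language of the terms.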


API and Documentation

Figure: a diagram of part of the structure of ConceptNet.

The newest release of ConceptNet, ConceptNet 5.4, is documented on our wiki.

The documentation includes how to use our REST API, which allows you to:

Retrieve the data for particular nodes and edges
Query for edges with given properties
Measure and query the semantic distance between nodes

It also describes the structure of ConceptNet and tells you about the various ways that you can access the ConceptNet data on your own computer.
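The three API capabilities listed above can be sketched as URL builders. This is only a sketch under stated assumptions: the `/data/5.4` root and the `/c/`, `/search`, and `/assoc` paths are recalled from the ConceptNet 5.4 API and should be checked against the wiki documentation before use. The function names are hypothetical, and the code only builds URL strings without making network requests.

```python
from urllib.parse import quote, urlencode

# Assumed API root for ConceptNet 5.4; verify against the wiki documentation.
API_ROOT = "http://conceptnet5.media.mit.edu/data/5.4"


def lookup_url(language: str, term: str) -> str:
    """URL for the data about one node; spaces in the term become underscores."""
    return f"{API_ROOT}/c/{language}/{quote(term.replace(' ', '_'))}"


def search_url(**properties: str) -> str:
    """URL for querying edges with the given properties (e.g. rel, start, end)."""
    # Sort the parameters so the same query always yields the same URL.
    query = urlencode(sorted(properties.items()), safe="/")
    return f"{API_ROOT}/search?{query}"


def association_url(language: str, term: str, limit: int = 10) -> str:
    """URL for finding terms that are semantically close to the given term."""
    return f"{API_ROOT}/assoc/c/{language}/{term}?{urlencode({'limit': limit})}"


if __name__ == "__main__":
    print(lookup_url("en", "semantic role"))
    print(search_url(rel="/r/UsedFor", end="/c/en/jazz"))
    print(association_url("en", "cat", limit=5))
```

Fetching one of these URLs with any HTTP client should return JSON, assuming the paths are correct for the release you are using.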

ConceptNet 5 is free

ConceptNet 5 comes largely from the hard work of hundreds of thousands of people who gave their time and knowledge for free. So ConceptNet is free, open knowledge as well.

You can get the entirety of ConceptNet 5 under the Creative Commons Attribution-ShareAlike 4.0 license. See Copying and sharing ConceptNet for more details.

To give proper attribution to ConceptNet, we suggest this text:

This work includes data from ConceptNet 5, which was compiled by the Commonsense Computing Initiative. ConceptNet 5 is freely available under the Creative Commons Attribution-ShareAlike license (CC BY SA 3.0) from http://conceptnet5.media.mit.edu. The included data was created by contributors to Commonsense Computing projects, contributors to Wikimedia projects, Games with a Purpose, Princeton University's WordNet, DBPedia, OpenCyc, and Umbel.

Sources and how to contribute

Previous versions of ConceptNet were a home-grown crowd-sourced project, where we ran a Web site collecting facts from people who came to the site. The Web of Data is much bigger than that now. Our data comes from many different sources, and you can contribute to many of them. Doing so improves the state not just of computational knowledge, but of human knowledge.

To begin with, ConceptNet 5 contains almost all the data from ConceptNet 4, created by contributors to the Open Mind Common Sense project.
We connect to a subset of DBPedia, which extracts knowledge from the infoboxes on Wikipedia articles.
Much of our knowledge comes from Wiktionary, the free multilingual dictionary, a sister project to Wikipedia. This gives us information about synonyms, antonyms, translations of concepts into hundreds of languages, and multiple labeled word senses for many words.
More dictionary-style knowledge comes from WordNet.
UMBEL connects ConceptNet to the OpenCyc ontology via a Semantic Web representation.
Some knowledge about people's intuitive word associations comes from "games with a purpose". We learn things in English from the GWAP project's word game Verbosity, and in Japanese from nadya.jp.

ConceptNet supports linked data: you can download a list of links to the greater Semantic Web, via DBPedia, UMBEL, and RDF/OWL WordNet. For example, our concept cat is linked to the DBPedia node at http://dbpedia.org/resource/Cat.

Downloading ConceptNet 5

If you want all the data in ConceptNet for your application, you can have it! We provide the data in various forms:

Flat files of all the assertions in ConceptNet, in JSON, msgpack, and CSV formats. These help to make the data usable without needing specific library support.
A SQLite database that indexes these flat files, allowing you to search ConceptNet on your own computer.
A Docker image that makes the entire ConceptNet build process reproducible.
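As a rough sketch of why flat files need no specific library support, the following reads JSON assertions with only the Python standard library. It assumes one JSON object per line with `start`, `rel`, and `end` fields holding ConceptNet URIs. That layout is an assumption to check against the documentation, and the sample lines are illustrative stand-ins built from the examples on this page, not an excerpt from a real download.

```python
import json
from typing import Dict, Iterable, Iterator, List

# Illustrative stand-in for a flat file: one JSON object per line (assumed layout).
SAMPLE_LINES = [
    '{"start": "/c/en/learn", "rel": "/r/MotivatedByGoal", "end": "/c/en/knowledge"}',
    '{"start": "/c/en/saxophone", "rel": "/r/UsedFor", "end": "/c/en/jazz"}',
    '{"start": "/c/en/semantic_role", "rel": "/r/HasContext", "end": "/c/en/linguistics"}',
]


def load_assertions(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Parse each non-empty line as one JSON assertion."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def edges_from(assertions: Iterable[Dict[str, str]], start: str) -> List[Dict[str, str]]:
    """Collect the assertions that start at the given concept URI."""
    return [a for a in assertions if a["start"] == start]


if __name__ == "__main__":
    for edge in edges_from(load_assertions(SAMPLE_LINES), "/c/en/saxophone"):
        print(edge["start"], edge["rel"], edge["end"])
```

With a real download, `SAMPLE_LINES` would be replaced by an open file handle, since iterating over a file yields one line at a time.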

Development

Current development of ConceptNet takes place as an open-source project of Luminoso Technologies, Inc., in collaboration with the MIT Media Lab, which provides its hosting. The code that builds and powers ConceptNet is available on GitHub.

ConceptNet originated at the MIT Media Lab, and became part of the Commonsense Computing Initiative, a collaboration between MIT and other labs and companies around the world. This global collaboration helps us collect relational knowledge in many languages. The Commonsense Computing Initiative was founded by Catherine Havasi, now the CEO of Luminoso.

The development of ConceptNet 5 is led by Rob Speer, a Luminoso co-founder, with contributions from several other people.

Mailing list and contact information

For general questions and further information, join our mailing list on Google Groups.