Astrolabe: A Tool to Explore Relationships Between Similar Words
Posted by Gentaro "hibariya" Terada
I am creating a tiny tool called Astrolabe that uses WordNet to explore relationships between two words. It is intended to help ESL learners like me get a clearer picture of a particular sense by recognizing the relationships between two words that have something in common.
Motivation
I have been trying to extend my vocabulary with Anki for years, little by little. It is an awesome tool to learn new words. I have learned about 6,000 words with it. Thanks to that, now it is much easier to read books and discuss diverse topics in English.
However, when building a sentence, I still sometimes feel that I am not choosing the correct word, because many words look similar to me: remove and delete; prevent and avoid; relation and association; endow and impart; and so on. The dictionary explanations for each pair are similar, and I cannot get a clear picture of the difference.
That distracts me and makes me hesitate to write and speak. I might sometimes have inadvertently failed to communicate with others. I want to talk more confidently. To that end, I need to understand the words I already know more deeply.
What Astrolabe Does to Solve the Problem
There are lots of corpus tools you can use online like SKELL and COCA. They are useful to learn the practical usage of a word. You can study real example sentences for a particular part-of-speech of a word and compare them with the usages of related words.
There is also another kind of tool: WordNet. This is a huge database that contains information about relationships between words. Here is the original motivation for creating Astrolabe: if WordNet can explain the relationship between different words, it should be useful for learning the difference between two words that have something in common.
WordNet is a huge graph of words. Astrolabe traces the graph and finds paths from one word to another. The following example is a path between “remove” and “delete”.
Sense#173351: remove#1;take#17;take_away#2;withdraw#12; (Verb)
(remove something concrete, as by lifting, pushing, or taking off, or remove something abstract; "remove a threat"; "remove a wrapper"; "Remove the dirty dishes from the table"; "take the gun from your pocket"; "This machine withdraws heat from the environment")
-- Hyponym --> Sense#1551969: delete#1;cancel#4; (Verb)
(remove or make invisible; "Please delete my name from your list")
-- Hyponym --> represents an edge of the path. It indicates that, in one sense, “delete” is a hyponym (a more specific word) of “remove”.
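To make the idea of tracing the graph more concrete, here is a minimal Python sketch of a breadth-first path search. It is not Astrolabe's actual implementation. The toy graph, the sense labels used as keys, and the names find_path, format_path, and max_depth are all assumptions made for illustration, and only the two senses from the example above are included.

```python
from collections import deque

# Toy graph (an assumption for illustration, not real WordNet data access):
# sense label -> list of (relation, neighbouring sense label).
GRAPH = {
    "remove#1": [("Hyponym", "delete#1")],
    "delete#1": [("Hypernym", "remove#1")],
}


def find_path(graph, start, goal, max_depth=3):
    """Breadth-first search from start to goal.

    Returns a list of (relation, sense) steps, or None if goal is not
    reachable within max_depth edges.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_depth:
            continue  # do not expand beyond the depth limit
        for relation, neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [(relation, neighbour)]))
    return None


def format_path(start, path):
    """Render a path in a style similar to the output shown above."""
    lines = [start]
    for relation, sense in path:
        lines.append(f"-- {relation} --> {sense}")
    return "\n".join(lines)


path = find_path(GRAPH, "remove#1", "delete#1")
if path is not None:
    print(format_path("remove#1", path))
```

Breadth-first search is used here because it returns a shortest path first, which seems a reasonable fit when the goal is to show the closest relationship between two words. The real tool may work differently.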
Another example below shows that “banish” and “relegate” have a common hypernym (a more generic word). A pair of words like this are called coordinate terms, or sisters.
Sense#2504017: banish#1;ban#4;ostracize#1;ostracise#1;shun#2;cast_out#1;blackball#1; (Verb)
(expel from a community or group)
-- Hypernym --> Sense#2501738: expel#1;throw_out#1;kick_out#1; (Verb)
(force to leave or move out; "He was expelled from his native country")
-- Hyponym --> Sense#2503803: banish#3;relegate#3;bar#3; (Verb)
(expel, as if by official decree; "he was banished from his own country")
This result reveals one sense that these two words share: both are more specific forms of the same word, “expel”. To learn the difference between them, we can now focus on how each one expels and what it expels from. The short descriptions and the few example sentences help with that.
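The coordinate-term relationship in the banish/relegate example can be sketched as a check for a shared direct hypernym. Again, this is only an illustration under stated assumptions: the HYPERNYMS table is hand-written from the output above, each synset is represented by a single label for brevity, and the function names are hypothetical.

```python
# Toy table (an assumption for illustration): sense label -> direct hypernyms.
# "relegate#3" stands for the synset banish#3;relegate#3;bar#3 shown above.
HYPERNYMS = {
    "banish#1": {"expel#1"},
    "relegate#3": {"expel#1"},
    "delete#1": {"remove#1"},
}


def shared_hypernyms(table, a, b):
    """Return the direct hypernyms that senses a and b have in common."""
    return table.get(a, set()) & table.get(b, set())


def are_coordinate_terms(table, a, b):
    """Two distinct senses are coordinate terms if they share a hypernym."""
    return a != b and bool(shared_hypernyms(table, a, b))


print(shared_hypernyms(HYPERNYMS, "banish#1", "relegate#3"))
print(are_coordinate_terms(HYPERNYMS, "banish#1", "delete#1"))
```

In this toy table, the first call yields the shared parent “expel”, while “banish” and “delete” have no direct hypernym in common.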
To understand a word, the definitions in a dictionary are simply not enough. Information about how to use it and how it differs from other words is also important. Corpora like SKELL and other tools like WordNet are good friends of ESL learners, and maybe Astrolabe could also be useful sometimes. We'll see.
Note that there are some limitations. Since WordNet groups synonyms that share a particular sense into one synset, it often cannot capture subtle nuances between those words. Also, because of the nature of its data structure, the paths found can differ for the same pair of words when the words are given in a different order.
How to Use
Astrolabe has only a simple command-line interface. All you can do is pass it two words. If there are any paths between the two, it prints the ones it finds.
$ ./dist/bin/astrolabe harmful deleterious
From: "harmful", To: "deleterious"
Finding "harmful" (Adjective)
Sense#1163575: harmful#1; (Adjective)
(causing or capable of causing harm; "too much sun is harmful to the skin"; "harmful effects of smoking")
-- Similar --> Sense#1164603: deleterious#1;hurtful#2;injurious#1; (Adjective)
(harmful to living things; "deleterious chemical additives")
In any case, you first have to build it from source yourself. I should have chosen another language that is easier to build. Sorry about that. In the future, if I don't give up on the idea, a friendlier web-based user interface would be great to have.