Concept: Semantic Number

Flatten a vocabulary onto a single number line: assign each word an integer ID such that the distance between two IDs approximates, as closely as possible, the semantic distance between the corresponding words.

Initialize:

  1. Start with a random word and assign it ID 0
  2. For the next ID, pick the word most related to the previously assigned word among those that have not been assigned yet
  3. Repeat until every word has been mapped to an ID
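
The initialization steps above can be sketched in a few lines of Python. This is only an illustration under assumptions the concept does not specify: words are represented by embedding vectors, "most related" is taken to mean smallest cosine distance, and the names `cosine_distance`, `greedy_init` and the toy `embeddings` dictionary are hypothetical.

```python
import math
import random


def cosine_distance(u, v):
    """Cosine distance between two vectors (assumed relatedness measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def greedy_init(embeddings, seed=0):
    """Chain words by nearest unassigned neighbour; returns a word -> ID dict."""
    rng = random.Random(seed)
    remaining = sorted(embeddings)        # sorted for reproducibility
    previous = rng.choice(remaining)      # step 1: random start word gets ID 0
    word_to_id = {previous: 0}
    remaining.remove(previous)
    while remaining:                      # step 3: repeat until all are mapped
        # step 2: the unassigned word closest to the previously assigned word
        nearest = min(
            remaining,
            key=lambda w: cosine_distance(embeddings[previous], embeddings[w]),
        )
        word_to_id[nearest] = len(word_to_id)
        remaining.remove(nearest)
        previous = nearest
    return word_to_id


# Toy, made-up 2-d embeddings purely for illustration.
embeddings = {
    "cat": (1.0, 0.1),
    "dog": (0.9, 0.2),
    "car": (0.1, 1.0),
    "truck": (0.2, 0.9),
}
word_to_id = greedy_init(embeddings)
```

With these toy vectors, "cat" and "dog" end up on neighbouring IDs, as do "car" and "truck", whichever word is drawn first. The scan over all unassigned words makes this sketch quadratic in vocabulary size; a nearest-neighbour index would be the obvious replacement for a large vocabulary.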

Refine:

  1. Propose a random swap between the IDs of two words
  2. If the swap reduces the divergence between pairwise distances in ID space and in semantic space, keep it; otherwise discard it
  3. Repeat until convergence
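
The refinement loop can be sketched the same way. The concept does not say how divergence is measured or when the process has converged, so both are assumptions here: divergence is the sum of squared differences between the ID distance (scaled to the range 0 to 1) and the semantic distance over all word pairs, and convergence means a fixed number of consecutive rejected swaps. The names `divergence`, `refine` and `max_rejects` are hypothetical.

```python
import itertools
import random


def divergence(order, dist):
    """Assumed objective: squared gap between scaled ID and semantic distance.

    order[k] is the index of the word holding ID k; dist is a symmetric
    matrix of semantic distances between word indices.
    """
    n = len(order)
    scale = max(n - 1, 1)
    total = 0.0
    for i, j in itertools.combinations(range(n), 2):
        id_gap = (j - i) / scale
        total += (id_gap - dist[order[i]][order[j]]) ** 2
    return total


def refine(order, dist, max_rejects=1000, seed=0):
    """Random-swap hill climbing; returns (refined order, its divergence)."""
    rng = random.Random(seed)
    order = list(order)                   # leave the caller's list untouched
    best = divergence(order, dist)
    rejects = 0
    while rejects < max_rejects and len(order) > 1:
        i, j = rng.sample(range(len(order)), 2)      # step 1: propose a swap
        order[i], order[j] = order[j], order[i]
        candidate = divergence(order, dist)
        if candidate < best:                         # step 2: keep if better
            best = candidate
            rejects = 0
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
            rejects += 1
    return order, best                               # step 3: assumed converged


# Toy example: four words lying on a line, so a perfect ordering exists.
dist = [[abs(a - b) / 3 for b in range(4)] for a in range(4)]
start = [2, 0, 3, 1]
refined, score = refine(start, dist)
```

Because only improving swaps are accepted, this is plain hill climbing and can stop in a local minimum. Recomputing the full divergence on every proposal is also wasteful: only the pairs involving the two swapped words change, so an incremental update would reduce the cost per proposal from quadratic to linear in vocabulary size.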
