Synonym and Hypernym/Hyponym Relation Mining


Keywords: synonym/synonymous/alias, antonym, hypernym/hyperonym/hypernymy (superordinate term), hyponym/hyponymy (subordinate term); synonym/hypernym/hypernym-hyponym extraction/detection/discovery/identification/generation

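Because hypernymy and hyponymy are inverse relations (reversing the order of a pair turns one into the other), the literature generally uses hypernym to cover both. (A practical reason to prefer hyperonym is that hypernym is, in its spoken form, hard to distinguish from hyponym in most dialects of English.)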
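As a small illustration of why a single relation is enough, the sketch below stores only (hyponym, hypernym) pairs and derives the reverse lookup by inverting them. The pairs and the name `invert_pairs` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical toy data, stored as (hyponym, hypernym) pairs.
HYPERNYM_PAIRS = [("dog", "animal"), ("cat", "animal"), ("apple", "fruit")]


def invert_pairs(pairs):
    """Build a hypernym -> set-of-hyponyms index from (hyponym, hypernym) pairs."""
    index = defaultdict(set)
    for hyponym, hypernym in pairs:
        index[hypernym].add(hyponym)
    return dict(index)


# The hyponym view is recovered from the hypernym pairs alone.
hyponyms_of = invert_pairs(HYPERNYM_PAIRS)
```

Looking up `hyponyms_of["animal"]` then returns the set containing "dog" and "cat", so no separate hyponym list has to be maintained.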

Resources

References

Hypernymy and hyponymy

  • Atzori M, Balloccu S. Fully-unsupervised embeddings-based hypernym discovery[J]. Information, 2020, 11(5): 268.

    The related-work section is fairly detailed.

Synonyms

Blogs

Papers

  • Cheng T, Lauw H W, Paparizos S. Entity synonyms for structured web search[J]. IEEE transactions on knowledge and data engineering, 2011, 24(10): 1862-1875.

    Microsoft; Click Similarity (ClickSim)

    • Cheng T, Lauw H W, Paparizos S. Fuzzy matching of web queries to structured data[C]//2010 IEEE 26th International Conference on Data Engineering (ICDE 2010). IEEE, 2010: 713-716.

      The paper that first proposed ClickSim.

  • Turney P D. Mining the web for synonyms: PMI-IR versus LSA on TOEFL[C]//European conference on machine learning. Springer, Berlin, Heidelberg, 2001: 491-502.

    Document Similarity (DocSim)

  • Chakrabarti K, Chaudhuri S, Cheng T, et al. A framework for robust discovery of entity synonyms[C]//Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. 2012: 1384-1392.

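    Microsoft; entity synonyms; based on click data; vertical search domains (e-commerce/video); how to use synonyms in vertical search; proposes two similarity measures: Pseudo Document Similarity (PseudoDocSim, which improves on ClickSim and DocSim) and Query Context Similarity (QCSim, which compensates for the shortcomings of ClickSim and DocSim).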
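The click-based and co-occurrence-based measures noted in the entries above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the papers' exact definitions: `click_sim` assumes a Jaccard overlap between the sets of documents clicked for two queries, and `pmi_ir_score` assumes the simplest PMI-IR form, co-occurrence hits divided by the candidate's own hits. The click log, the hit counts, and both function names are hypothetical.

```python
def click_sim(clicks_a, clicks_b):
    """Jaccard overlap between two sets of clicked document IDs (assumed form)."""
    a, b = set(clicks_a), set(clicks_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pmi_ir_score(cooccurrence_hits, candidate_hits):
    """Simplest PMI-IR style score: hits(problem AND candidate) / hits(candidate)."""
    if candidate_hits == 0:
        return 0.0
    return cooccurrence_hits / candidate_hits


# Hypothetical click log: query -> IDs of documents clicked for that query.
click_log = {
    "microsoft excel": {"d1", "d2", "d3"},
    "ms excel": {"d2", "d3", "d4"},
    "excel tutorial": {"d9"},
}

# Two shared documents out of four distinct ones -> 0.5
sim = click_sim(click_log["microsoft excel"], click_log["ms excel"])
```

Under these assumptions a candidate pair would be accepted when its score exceeds a chosen threshold. Queries with few clicks yield sparse overlap, which is the kind of weakness that PseudoDocSim and QCSim are noted above as addressing.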

Resources
