The SPECIALIST Lexicon API

2023-10-28 03:18

Using the SPECIALIST Lexicon Java API

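An affix is classified by position as a prefix or a suffix, and by function as inflectional or derivational.
Derivation uses either a prefix or a suffix: for example, happy plus a suffix gives happily, and happy plus a prefix gives unhappy.
Inflection adds an affix only at the end of a word to mark tense, number, case, and similar changes: for example, ask, asks, asking, asked.

Derivation produces a derived word and can change the part of speech and the meaning.

Inflection is a purely grammatical change.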

LvgCmdApi

Documentation for all flow components: lvg2021/docs/designDoc/UDF/flow/index.html

-f:a

Expands acronyms and abbreviations.

-f:b

Uninflect a term, that is, return it to its base form.

It turns plural nouns into singular nouns, inflected verbs into their infinitive forms, and adjectives and adverbs into their positive forms. It does not convert a word into a noun.

-f:An 

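Anti-Normalize (Approximate Match)

Takes a normalized term as input and returns the matching un-normalized terms from the lexicon. It can be used as a basic approximate match, for example to map non-standard terms to forms found in the lexicon.

Results are sorted alphabetically, then by EUI, category, and inflection.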

String outputFromLvg = null;
LvgCmdApi lvgApi = new LvgCmdApi("-f:An", "D:/lvg2021/data/config/lvg.properties");
// ---------------------------------
// process each term
// ---------------------------------
outputFromLvg = lvgApi.MutateToString("term");
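The result ordering stated for -f:An (alphabetical, then EUI, category, inflection) can be sketched with a chained comparator. This is only an illustration: the `Result` class, the `sorted` helper, and the sample terms and EUIs are made up here and are not part of LVG.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ResultOrderSketch {
    // Hypothetical holder for one result record; not an LVG class.
    static final class Result {
        final String term;
        final String eui;
        final long category;
        final long inflection;

        Result(String term, String eui, long category, long inflection) {
            this.term = term;
            this.eui = eui;
            this.category = category;
            this.inflection = inflection;
        }
    }

    // Compare by term (plain String order), then EUI, then category, then inflection.
    static final Comparator<Result> ORDER =
        Comparator.comparing((Result r) -> r.term)
                  .thenComparing(r -> r.eui)
                  .thenComparingLong(r -> r.category)
                  .thenComparingLong(r -> r.inflection);

    // Return a sorted copy and leave the input list untouched.
    static List<Result> sorted(List<Result> in) {
        List<Result> out = new ArrayList<>(in);
        out.sort(ORDER);
        return out;
    }

    public static void main(String[] args) {
        List<Result> results = new ArrayList<>();
        // Invented sample records, for illustration only.
        results.add(new Result("left", "E0000002", 1024, 32));
        results.add(new Result("left", "E0000001", 128, 512));
        results.add(new Result("leave", "E0000003", 1024, 1));
        for (Result r : sorted(results)) {
            System.out.println(r.term + "|" + r.eui + "|" + r.category + "|" + r.inflection);
        }
    }
}
```

Ties on the term fall through to the EUI, and so on down the chain, which is what "sorted by A, B, C, and then D" means in practice.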

-f:d

Generate derivational variants.

Derivation rules are documented in lvg2021/docs/designDoc/UDF/derivations/index.html

Derivational variants are generated from FACTs (a pre-computed derivational table) and morphology rules (RULEs). FACTs are stored in a database and retrieved by SQL query. RULEs are stored in and retrieved from a trie.
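The two lookup paths can be illustrated with a toy model: a map stands in for the FACT table, and a trie keyed on reversed suffixes stands in for the RULE store. Everything below (class name, methods, the sample fact, and the two suffix rules) is invented for illustration and does not reflect LVG's actual data or code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DerivationSketch {
    // One trie node. Children are keyed by the next character, reading the suffix backwards.
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String replacement; // non-null when a rule ends at this node
    }

    private final Node root = new Node();
    // Stand-in for the FACT table (LVG keeps FACTs in a database).
    private final Map<String, List<String>> facts = new HashMap<>();

    // Register a known pair.
    void addFact(String from, String to) {
        facts.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Register a suffix rule: a word ending in suffix gets that suffix replaced.
    void addRule(String suffix, String replacement) {
        Node node = root;
        for (int i = suffix.length() - 1; i >= 0; i--) {
            node = node.children.computeIfAbsent(suffix.charAt(i), c -> new Node());
        }
        node.replacement = replacement;
    }

    // Facts first, then every rule whose suffix matches the end of the word.
    List<String> derive(String word) {
        List<String> out = new ArrayList<>(facts.getOrDefault(word, new ArrayList<>()));
        Node node = root;
        // Walk from the last character; stop before index 0 so a stem always remains.
        for (int i = word.length() - 1; i > 0; i--) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                break;
            }
            if (node.replacement != null) {
                String candidate = word.substring(0, i) + node.replacement;
                if (!out.contains(candidate)) {
                    out.add(candidate);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        DerivationSketch d = new DerivationSketch();
        d.addFact("strong", "strength"); // irregular pair, looked up directly
        d.addRule("ness", "");           // kindness -> kind
        d.addRule("ly", "");             // quickly -> quick
        System.out.println(d.derive("strong"));
        System.out.println(d.derive("kindness"));
        System.out.println(d.derive("quickly"));
    }
}
```

Storing suffixes reversed lets one pass from the end of the word find every matching rule, whatever its length, without testing each rule separately.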

-f:dc~NUMBER

Generates derivations restricted to the categories (parts of speech) given by the number.

Category  Value
adj       1
adv       2
aux       4
compl     8
conj      16
det       32
modal     64
noun      128
prep      256
pron      512
verb      1024

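// w (the input word) and derivword (a collection of strings) are declared elsewhere
String outputFromLvg = null;
// 128 = noun: keep only derivations that are nouns
LvgCmdApi lvgApi = new LvgCmdApi("-f:dc~128", "D:/lvg2021/data/config/lvg.properties");
outputFromLvg = lvgApi.MutateToString(w);
String[] outs = outputFromLvg.split("\n");
if (outputFromLvg.length() > 0) {
    for (String out : outs) {
        // field 2 of each pipe-delimited line is the output term
        derivword.add(out.split("\\|")[1]);
    }
}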
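Each category value is a distinct power of two, so a combined value can be computed by adding (or OR-ing) the individual values. The sketch below builds such a number; it assumes that -f:dc accepts a combined value in this way, and the `CategoryBits` class and its `valueOf` helper are hypothetical names, not part of the LVG API.

```java
import java.util.Arrays;
import java.util.List;

class CategoryBits {
    // Category names in bit order: bit 0 = adj (1) up to bit 10 = verb (1024).
    static final List<String> NAMES = Arrays.asList(
        "adj", "adv", "aux", "compl", "conj", "det",
        "modal", "noun", "prep", "pron", "verb");

    // Combine the bit values of the named categories into one number.
    static long valueOf(String... categories) {
        long value = 0;
        for (String c : categories) {
            int bit = NAMES.indexOf(c);
            if (bit < 0) {
                throw new IllegalArgumentException("unknown category: " + c);
            }
            value |= 1L << bit;
        }
        return value;
    }

    public static void main(String[] args) {
        // noun (128) + verb (1024)
        long nounOrVerb = valueOf("noun", "verb");
        // Assumption: the flow option takes the combined value.
        System.out.println("-f:dc~" + nounOrVerb);
    }
}
```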

-f:d -kdt:STR

Restricts the derivation type.

  • Z (zeroD): restricts the output to zero derivations of the input (the form does not change).
  • S (suffixD): restricts the output to suffix derivations of the input.
  • P (prefixD): restricts the output to prefix derivations of the input.
  • ZS (zeroD and suffixD): restricts the output to zero and suffix derivations of the input. This is one of the most used options with query expansion for CUI mapping.
  • ZP (zeroD and prefixD): restricts the output to zero and prefix derivations of the input.
  • SP (suffixD and prefixD): restricts the output to suffix and prefix derivations of the input.
  • ZSP (all): no restriction on derivation type. All zeroD (Z), suffixD (S), and prefixD (P) derivations are displayed. This is the default option.

-f:f

Filter output to contain only forms from the lexicon.

Forms that are not in the lexicon are filtered out, and only one record is returned.

Inflection output filter: -k:i:1

Derivation output filter: -k:d:1

-f:i

Generate inflectional variants.

-f:Ln

Retrieves category (part of speech) and inflection information for a term from the database.

-f:nom

Retrieve nominalizations for an input term, that is, the noun forms related to the input.

-f:N3

Possibly the same as LuiNorm.

It normalizes non-ASCII Unicode characters to ASCII, removes genitives, removes parenthetic plural forms, replaces punctuation with spaces, removes stop words, lowercases, uninflects each word, maps each uninflected word to its canonical form, strips or maps any remaining non-ASCII Unicode characters to ASCII, and finally sorts the words.

-f:r

Recursively generate synonyms.

Norm API

lvg2021/docs/userDoc/examples/norm.html

Equivalent to the flow -f:q0:g:rs:o:t:l:B:Ct:q7:q8:w

  1. q0: map Unicode symbols and punctuation to ASCII
  2. g: remove genitives,
  3. rs: then remove parenthetic plural forms of (s), (es), (ies), (S), (ES), and (IES),
  4. o: then replace punctuation with spaces,
  5. t: then remove stop words,
  6. l: then lowercase,
  7. B: then uninflect each word,
  8. Ct: then get citation form for each base form,
  9. q7: then apply Unicode Core Norm,
  10. q8: then strip or map non-ASCII Unicode characters,
  11. w: and finally sort the words in alphabetic order.
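Several of these steps are plain string operations and can be sketched without the lexicon. The toy below covers only g, rs, o, t, l, and w; it skips q0, B, Ct, q7, and q8, which need LVG's data tables. The class name, the regular expressions, and the stop-word list are assumptions made for illustration, not LVG's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class NormSketch {
    // Illustrative stop-word list; an assumption, not LVG's actual list.
    static final List<String> STOP_WORDS =
        Arrays.asList("of", "and", "with", "for", "to", "in", "by", "on", "the");

    static String normalize(String term) {
        String s = term;
        s = s.replaceAll("'s\\b", "");                // g: remove genitive 's
        s = s.replaceAll("(?i)\\((s|es|ies)\\)", ""); // rs: remove (s), (es), (ies)
        s = s.replaceAll("\\p{Punct}", " ");          // o: punctuation to spaces
        List<String> words = new ArrayList<>();
        for (String w : s.trim().split("\\s+")) {
            if (w.isEmpty()) {
                continue;
            }
            String lower = w.toLowerCase(Locale.ROOT); // l: lowercase
            if (STOP_WORDS.contains(lower)) {          // t: drop stop words
                continue;
            }
            words.add(lower);
        }
        Collections.sort(words);                       // w: sort the words
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Hodgkin's Disease(s), of the Lung"));
    }
}
```

Because the words are sorted at the end, inputs that differ only in word order, case, punctuation, or stop words collapse to the same string.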

The resulting words may not exist in the lexicon.

For example, norm turns right into ride.

import java.util.*;
import gov.nih.nlm.nls.lvg.Api.*;

public class Normalization
{
    // test driver
    public static void main(String[] args)
    {
        // instantiate a NormApi object by config file
        String lvgConfigFile = "/export/home/lu/Projects/LVG/lvg2012/data/config/lvg.properties";
        NormApi normApi = new NormApi(lvgConfigFile);

        // the term to normalize
        String in = "left";

        try
        {
            Vector<String> outs = normApi.Mutate(in);

            // print out the result
            for (String out : outs)
            {
                System.out.println(in + "|" + out);
            }

            // clean up
            normApi.CleanUp();
        }
        catch (Exception e)
        {
            System.err.println("** ERR: " + e.toString());
        }
    }
}

Output format

Field 1   Field 2       Field 3      Field 4       Field 5        Field 6       Field 7+
Input     Output Term   Categories   Inflections   Flow History   Flow Number   Additional Information

output term: the transformed term

categories:

Bit  Value  Variant  Other Symbols           Example
0    1      adj      adjective, ADJ          red
1    2      adv      adverb, ADV             quickly
2    4      aux      auxiliary               be, is, are, do, have, has
3    8      compl    complementizer          that
4    16     conj     conjunction, CON, con   and, or, but
5    32     det      determiner, DET         a, the, some, each
6    64     modal    -                       can, dare, may, must, ought, shall, will
7    128    noun     NOM, NPR                dog
8    256    prep     preposition, PRE, pre   to, on, in, at, by
9    512    pron     pronoun                 it, he, they
10   1024   verb     VER, ver                break

inflection:

 

Bit  Value    Variant        Other Symbols  Example
0    1        base           -              dog, break, red, quickly
1    2        comparative    -              redder
2    4        superlative    -              reddest
3    8        plural         p              dogs
4    16       presPart       ing            breaking
5    32       past           -              broke
6    64       pastPart       -              broken
7    128      pres3s         -              breaks
8    256      positive       -              red
9    512      singular       s              dog
10   1024     infinitive     inf            break
11   2048     pres123p       -              break
12   4096     pastNeg        -              didn't, couldn't, wouldn't, shouldn't
13   8192     pres123pNeg    -              don't, won't
14   16384    pres1s         -              am
15   32768    past1p23pNeg   -              weren't
16   65536    past1p23p      -              were
17   131072   past1s3sNeg    -              wasn't
18   262144   pres1p23p      -              are
19   524288   pres1p23pNeg   -              aren't
20   1048576  past1s3s       -              was
21   2097152  pres           -              can
22   4194304  pres3sNeg      -              isn't, hasn't
23   8388608  presNeg        -              can't, cannot

where:

  • pres: present
  • past: past
  • Part: participle
  • 1: first person
  • 2: second person
  • 3: third person
  • s: singular
  • p: plural
  • Neg: Negative

additional information: shown with the -m option
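Putting the output format together: each line is pipe-delimited, and fields 3 and 4 are numbers whose set bits name the categories and inflections. A minimal sketch of splitting a line and decoding those two fields follows. The sample line is invented to match the field layout, and the `LvgOutputSketch` class and `decode` helper are hypothetical, not part of the LVG API.

```java
import java.util.ArrayList;
import java.util.List;

class LvgOutputSketch {
    // Names in bit order, taken from the category table.
    static final String[] CATEGORIES = {
        "adj", "adv", "aux", "compl", "conj", "det",
        "modal", "noun", "prep", "pron", "verb"
    };

    // Names in bit order, taken from the inflection table.
    static final String[] INFLECTIONS = {
        "base", "comparative", "superlative", "plural", "presPart", "past",
        "pastPart", "pres3s", "positive", "singular", "infinitive", "pres123p",
        "pastNeg", "pres123pNeg", "pres1s", "past1p23pNeg", "past1p23p",
        "past1s3sNeg", "pres1p23p", "pres1p23pNeg", "past1s3s", "pres",
        "pres3sNeg", "presNeg"
    };

    // Return the names of all bits set in value, lowest bit first.
    static List<String> decode(long value, String[] names) {
        List<String> out = new ArrayList<>();
        for (int bit = 0; bit < names.length; bit++) {
            if ((value & (1L << bit)) != 0) {
                out.add(names[bit]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample line: input|output term|categories|inflections|flow history|flow number|
        String line = "dogs|dog|128|1|b|1|";
        // Limit -1 keeps the trailing empty field.
        String[] fields = line.split("\\|", -1);
        System.out.println("input       = " + fields[0]);
        System.out.println("output term = " + fields[1]);
        System.out.println("categories  = " + decode(Long.parseLong(fields[2]), CATEGORIES));
        System.out.println("inflections = " + decode(Long.parseLong(fields[3]), INFLECTIONS));
        System.out.println("flow        = " + fields[4] + " (#" + fields[5] + ")");
    }
}
```

A value such as 1152 decodes to noun and verb, since 1152 = 128 + 1024.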

 

Sub-Term Mapping Tools (STMT)

LexItem Sub-Term Finder (LSF):

  • to find whether a term is in the Lexicon
  • to find all sub-terms that are in the Lexicon
  • to find all prefix sub-terms that are in the Lexicon
  • to find the longest prefix sub-term in the Lexicon
// check whether the term exists in the corpus
LsfApi lsfApi = new LsfApi("D:/stmt2015/data/config/lsf.properties");
String isincorpus = lsfApi.CheckInCorpus("alis");
// prefixes? For a single standalone word, prefixes do not seem to be recognized
Vector<String> prefixes = lsfApi.FindPrefixes("cricoarytenoid");
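One reading of the single-word observation above is that prefix sub-terms are built from whole words rather than from characters. Under that assumption, the idea of finding prefix sub-terms and the longest one can be sketched against a toy lexicon. The class, both methods, and the lexicon entries are invented for illustration and are not the LsfApi implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class PrefixSubTermSketch {
    // All word-level prefixes of term that occur in lexicon, shortest first.
    static List<String> findPrefixes(String term, Set<String> lexicon) {
        List<String> found = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (String w : term.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (w.isEmpty()) {
                continue;
            }
            if (prefix.length() > 0) {
                prefix.append(' ');
            }
            prefix.append(w);
            if (lexicon.contains(prefix.toString())) {
                found.add(prefix.toString());
            }
        }
        return found;
    }

    // The longest matching prefix, or null when nothing matches.
    static String longestPrefix(String term, Set<String> lexicon) {
        List<String> found = findPrefixes(term, lexicon);
        return found.isEmpty() ? null : found.get(found.size() - 1);
    }

    public static void main(String[] args) {
        // Toy lexicon; the real Lexicon is far larger.
        Set<String> lexicon = new HashSet<>(Arrays.asList("lung", "lung cancer", "screening"));
        System.out.println(findPrefixes("lung cancer screening", lexicon));
        System.out.println(longestPrefix("lung cancer screening", lexicon));
    }
}
```

With word-level prefixes, a one-word term has only itself as a candidate, which would be consistent with no prefix being reported for a single word.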

 
