A Framework for Automatically Generating Large-Model Fine-Tuning Instructions for Polysemous-Word Example Sentence Corpus Generation
Abstract: First, a manual instruction set containing a body description set and a list of instruction examples is constructed as the initial input to the instruction pool. Then, instructions from the instruction pool are fed into the large model to generate a number of machine instructions and their corresponding corpora, and the generated corpora are refined through text correction to obtain the desired polysemy example sentence corpus. Finally, the edit distance algorithm is used to deduplicate the machine instructions, and the spectral clustering algorithm is used to cluster the candidate machine instructions, thereby achieving automated generation of machine instructions. By updating the instruction pool, the polysemy example sentence corpus is generated iteratively. The results show that the constructed polysemy example sentence dataset and its corresponding large-model machine instruction set exhibit good linguistic diversity and content diversity. The constructed dataset meets the needs of second language learners in terms of sentence length, sentiment, vocabulary difficulty level, and topics.
Keywords: large language model; instruction generation; polysemy; example sentence generation; ChatGPT
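The deduplication step named in the abstract can be illustrated with a minimal sketch. It assumes that two instructions count as duplicates when their Levenshtein distance, normalized by the longer string's length, falls at or below a threshold. The function names, the normalization, and the `max_ratio` value of 0.2 are illustrative assumptions, not details taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def deduplicate(instructions, max_ratio=0.2):
    """Keep an instruction only if it is not a near-duplicate of one already kept.

    max_ratio is a hypothetical threshold on the normalized edit distance.
    """
    kept = []
    for cand in instructions:
        is_dup = False
        for k in kept:
            longest = max(len(cand), len(k)) or 1  # avoid division by zero
            if edit_distance(cand, k) / longest <= max_ratio:
                is_dup = True
                break
        if not is_dup:
            kept.append(cand)
    return kept


pool = [
    "Write a sentence using the word 'bank'.",
    "Write a sentence using the word 'bank'!",
    "List three meanings of 'spring'.",
]
unique = deduplicate(pool)
```

In this sketch the second instruction differs from the first by a single character, so it is dropped, while the third is kept. The surviving candidates would then be passed to the clustering step.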
Chinese is a complex language with abundant polysemy, that is, a single character or word can have several distinct meanings.
- Journal of Huaqiao University (Natural Science)
- 2025, Issue 3