Method Article

Two-Stage Recruitment Text Labeling via Lexicon-Guided Routing and Retrieval-Augmented Large Language Models

DOI: 10.3791/70153

May 8th, 2026

Summary

A two-stage workflow is described for classifying recruitment texts into Artificial Intelligence, Environmental Protection, or Other. Lexicon-guided triage assigns routine cases, while retrieval-augmented, prompt-optimized large language model inference resolves ambiguous cases, enabling accurate large-scale labeling and downstream descriptive labor-market summarization.
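The stage-one routing rule can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the two term sets below are hypothetical placeholders (the curated lexicons are provided in Supplement File 3), plain substring matching is an assumption, and the function names `matches` and `triage` are invented for this sketch.

```python
from typing import Iterable

# Hypothetical toy lexicons; the study's curated lists are in Supplement File 3.
AI_TERMS = {"机器学习", "深度学习", "算法工程师"}
ENV_TERMS = {"环保", "污水处理", "碳排放"}


def matches(text: str, lexicon: Iterable[str]) -> bool:
    """Return True if any lexicon term occurs in the text (substring match, an assumption)."""
    return any(term in text for term in lexicon)


def triage(text: str) -> str:
    """Stage-one routing: return a final label, or 'UNCERTAIN' to send the posting to stage two."""
    hit_ai = matches(text, AI_TERMS)
    hit_env = matches(text, ENV_TERMS)
    if hit_ai and hit_env:
        return "UNCERTAIN"  # dual match: defer to retrieval-augmented inference
    if hit_ai:
        return "Artificial Intelligence"
    if hit_env:
        return "Environmental Protection"
    return "Other"  # no lexicon match
```

Only postings that return `"UNCERTAIN"` need a large language model call, which is what keeps the workflow tractable at corpus scale.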

Abstract

Recruitment texts are heterogeneous, and domain terminology evolves over time, making large-scale labeling difficult with limited manual annotation capacity. This protocol provides a reproducible two-stage workflow for classifying Chinese recruitment texts into Artificial Intelligence, Environmental Protection, or Other. Stage one performs lexicon-guided triage using two curated domain keyword lists: postings that match exactly one domain lexicon are labeled with that domain, postings that match neither lexicon are labeled as Other, and dual-match postings are routed to stage two. Stage two applies retrieval-augmented large language model inference. A knowledge base compiled from 12 authoritative domain documents is converted to text, split into chunks of about 1,000 characters with a 20-character overlap, embedded using all-MiniLM-L6-v2, and indexed in LanceDB. For each uncertain posting, cosine-similarity k-nearest-neighbor retrieval returns candidate chunks, which are filtered by a minimum similarity threshold (τ); up to four chunks are then injected into a fixed prompt template that defines label boundaries and constrains the output to a single label. A benchmark set of 3,000 postings (1,000 per class) is manually verified, with an inter-annotator agreement of 95.0% and a Cohen's kappa (κ) of about 0.93. Evaluation compares multiple open-source large language models in a prompt-only setting and then a retrieval-augmented setting using Qwen2.5-7B-Instruct. The retrieval-augmented configuration achieves 98.47% accuracy and supports corpus-scale labeling of about 6.93 million postings for downstream descriptive aggregation.
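The chunking, thresholded retrieval, and prompt-assembly steps can be illustrated with a dependency-free sketch. Several elements here are assumptions made for illustration: a brute-force cosine search stands in for the LanceDB index, vectors are supplied directly rather than produced by all-MiniLM-L6-v2, the default `tau=0.3` is a placeholder because the threshold value is not stated here, and the prompt wording is hypothetical (the final template is in Supplement File 2).

```python
import math
from typing import List, Sequence, Tuple


def chunk_text(text: str, size: int = 1000, overlap: int = 20) -> List[str]:
    """Split text into character windows of `size` that share `overlap` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float],
             index: List[Tuple[str, Sequence[float]]],
             tau: float = 0.3,  # placeholder threshold, not the study's value
             k: int = 4) -> List[str]:
    """Return up to k chunks whose similarity to the query is at least tau, best first."""
    scored = [(cosine(query_vec, vec), chunk) for chunk, vec in index]
    kept = [(s, c) for s, c in scored if s >= tau]
    kept.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in kept[:k]]


def build_prompt(posting: str, chunks: List[str]) -> str:
    """Assemble a fixed-template prompt (hypothetical wording) with retrieved context."""
    context = "\n".join(f"- {c}" for c in chunks) or "(no reference passages retrieved)"
    return (
        "Classify the job posting as exactly one of: "
        "Artificial Intelligence, Environmental Protection, Other.\n"
        f"Reference passages:\n{context}\n"
        f"Job posting:\n{posting}\n"
        "Answer with the label only."
    )
```

Filtering by τ before truncating to four chunks means that a posting with no sufficiently similar passage falls back to a prompt without retrieved context instead of receiving weakly related text.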

Introduction

In recent years, rapid progress in AI and environmental technologies has produced a large number of new occupations. Globally, the job market has seen many new positions centered on AI and sustainability; vacancy-based evidence documents a rapid expansion in AI-related skill demand, which increases the heterogeneity of recruitment texts and motivates a reproducible, scalable domain-tagging protocol [1]. In China, the latest revision of the Occupational Classification Dictionary (OCD) shows similarly robust growth in new occupations in the fields of employment informatization and green and....

Protocol

This study did not involve human participants or animal subjects. Therefore, ethical approval and informed consent were not required. The workflow was executed in a Python 3.10 environment on a 64-bit Windows operating system using the hardware, tools, and software listed in the Table of Materials.

1. Modeling

  1. Constructing the data and knowledge base
    1. Collect and curate an in-house structured corpus of job advertisements for publicly listed companies from major online recruitment platforms in China (2014–2023), ensuring compliance with platform policies and applicable regulation....

Results

Four open-source, instruction-tuned large language model backbones (Backbone A–D) listed in the Table of Materials are benchmarked on the three-class job-posting classification task. Evaluation is conducted using the same test protocol, preprocessing procedures, and metrics across all backbones, including overall accuracy, precision, recall, and F1. Prompt refinement and retrieval-augmented generation (RAG) are subsequently applied to Backbone B and re-evaluated under identical conditions. The performanc.......
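The reported metrics can be computed from gold and predicted labels as in the following sketch. Macro-averaging over the three classes is an assumption, since the averaging scheme is not stated in the visible text, and the function name `evaluate` is hypothetical.

```python
from typing import Dict, List, Sequence

LABELS = ("Artificial Intelligence", "Environmental Protection", "Other")


def evaluate(y_true: List[str], y_pred: List[str],
             labels: Sequence[str] = LABELS) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1 (macro-averaging is assumed)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Because the benchmark set is balanced at 1,000 postings per class, macro-averaged and class-weighted averages of recall coincide with overall accuracy, which makes the per-backbone comparison straightforward to interpret.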

Discussion

Prior work has applied classical machine learning and neural models to recruitment text categorization, including resume and job description classification, but such approaches often require substantial labeled data and may degrade when domain terminology shifts. Ali et al. proposed a resume classification system using NLP and machine learning techniques [18]. Jalili et al. explored BiLSTM-based resume classification [19]. Pal et al. evaluated resume categorization using classica.......

Disclosures

The authors have nothing to disclose.

Acknowledgements

The authors thank colleagues for technical support and constructive feedback during data curation and manuscript preparation. This project received no external funding.

Materials

List of materials used in this article
| Name | Company | Catalog Number | Comments |
| --- | --- | --- | --- |
| all-MiniLM-L6-v2 (sentence encoder) | Hugging Face (sentence-transformers) | https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 | Sentence embedding model used for vectorization. |
| bert-base-chinese (baseline encoder) | Hugging Face (google-bert) | https://huggingface.co/google-bert/bert-base-chinese | Baseline encoder (Chinese BERT base). |
| DDR5 memory, 16 GB | Workstation/laptop configuration | N/A | System memory (16 GB DDR5); vendor/part number not specified. |
| DeepSeek-R1-Distill-Qwen-7B (Backbone A) | Hugging Face (deepseek-ai) | https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | LLM backbone A. |
| GLM-4-9B-Chat (Backbone C) | Hugging Face (zai-org) | https://huggingface.co/zai-org/glm-4-9b-chat-hf | LLM backbone C. |
| Intel Core i5-13500HX CPU | Intel | https://www.intel.com/content/www/us/en/products/sku/232156/intel-core-i513500hx-processor-24m-cache-up-to-4-70-ghz/specifications.html | Workstation CPU used for experiments. |
| InternLM2.5-7B-Chat (Backbone D) | Hugging Face (internlm) | https://huggingface.co/internlm/internlm2_5-7b-chat | LLM backbone D. |
| jieba (Chinese tokenizer) | PyPI (jieba) | https://pypi.org/project/jieba/ | Chinese tokenization/segmentation library; version not specified. |
| Keyword libraries (AI & Environmental) (Supplement File 3) | This study | Supplement File 3 | Domain keyword lexicons used for lexicon-guided pre-filtering; archive exact version used per run. |
| LanceDB (vector database) | LanceDB | N/A | Vector database for storing/searching embeddings; version not specified. |
| NVIDIA GeForce RTX 4060 Laptop GPU (8 GB VRAM) | NVIDIA | https://www.nvidia.com/en-us/geforce/laptops/40-series/ | GPU used for inference/experiments. |
| Online recruitment platform job postings (2014–2023) | Major Chinese recruitment platforms (public web data) | N/A | Publicly posted job advertisements collected and curated by the authors; ~6.93 million postings after cleaning. |
| pip (Python package installer) | PyPA | N/A | Package installer used to install dependencies and export an environment snapshot using pip freeze. |
| Prompt template evolution (Supplement File 2) | This study | Supplement File 2 | Successive prompt variants and the final prompt used for evaluation. |
| Python 3.10 | Python Software Foundation | N/A | Runtime interpreter (Python 3.10); patch version not specified. |
| Qwen2.5-7B-Instruct (Backbone B) | Hugging Face (Qwen) | https://huggingface.co/Qwen/Qwen2.5-7B-Instruct | LLM backbone B. |
| RAG knowledge base field descriptions/index (Supplement File 1) | This study | Supplement File 1 | Schema/field notes and document index for the knowledge base used in retrieval-augmented inference. |
| Thesaurus (AI & Environmental) (Supplement File 4) | This study | Supplement File 4 | Term dictionaries used in reclassification and analysis; archive exact version used per run. |
| Windows 11 (64-bit) | Microsoft | N/A | Operating system (Windows 11, 64-bit); build/version not specified. |

References

  1. Alekseeva, L., et al. The demand for AI skills in the labor market. Labour Economics. 71, 102002 (2021).
  2. Ministry updates official dictionary of careers. Ministry of Human Resources and Social Security of the People’s Republic of China. Available at: https://english.www.gov.cn/statecouncil/ministries/202209/30/content_WS633647bbc6d0a75....

Tags

Recruitment Text Labeling, Lexicon Guided Routing, Retrieval Augmented Models, Large Language Models, Domain Keyword Lists, Knowledge Base Construction, Cosine Similarity Retrieval, K Nearest Neighbor, Chinese Text Classification, Inter Annotator Agreement