FAID

Fine-Grained AI-Generated Text Detection Using Multi-Task Auxiliary and Multi-Level Contrastive Learning

BKAI Research Center, Hanoi University of Science and Technology; MBZUAI; INSAIT, Sofia University "St. Kliment Ohridski"

Abstract

The growing collaboration between humans and LLMs in generative tasks has introduced new challenges in distinguishing between human-written, LLM-generated, and human-LLM collaborative texts. In this work, we collect a multilingual, multi-domain, multi-generator dataset, FAIDSet. We further introduce a fine-grained detection framework, FAID, to classify text into these three categories and to identify the underlying LLM family of the generator. Unlike existing binary classifiers, FAID is built to capture both authorship and model-specific characteristics. Our method combines multi-level contrastive learning with multi-task auxiliary classification to learn subtle stylistic cues. By modeling LLM families as distinct stylistic entities, FAID adapts to distributional shifts without retraining on unseen data. Our results demonstrate that FAID outperforms several baselines, particularly improving generalization accuracy on unseen domains and new LLMs, offering a potential solution for improving transparency and accountability in AI-assisted writing.

FAIDSet

FAIDSet is a multilingual, multi-domain dataset for fine-grained AI-generated text detection, comprising 83k+ examples across three authorship categories: human-written, LLM-generated, and human-LLM collaborative texts. It covers academic writing in English and Vietnamese, including paper abstracts and student theses, and spans multiple LLM families such as GPT, Gemini, Llama, and DeepSeek.

Designed to support robust and generalizable detection, FAIDSet captures diverse collaboration patterns and serves as both a training resource and a benchmark for evaluating performance under unseen domains and generators.
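
Conceptually, each FAIDSet example pairs a text with its fine-grained authorship label and, where applicable, the generator's LLM family. The record below is a hypothetical illustration of that information; the field names are not the dataset's actual schema.

```python
# Hypothetical sketch of a single FAIDSet record; the field names are illustrative,
# not the dataset's actual schema.
example = {
    "text": "We propose a fine-grained detector for AI-generated text ...",
    "label": "human-LLM collaborative",  # or "human-written" / "LLM-generated"
    "generator_family": "GPT",           # GPT, Gemini, Llama, DeepSeek; None for human-written text
    "language": "en",                    # English or Vietnamese
    "source": "arXiv abstracts",         # academic-writing domain of the text
}
```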

Subset       Human    LLM      Human-LLM collab.
Train        14,176   12,076   32,091
Validation    3,038    2,588    6,876
Test          3,038    2,588    6,879
Total        20,252   17,252   45,846
Table: Number of examples per label in subsets.
Source                      Human texts
arXiv abstracts              2,000
VJOL abstracts               2,195
HUST theses (English)        4,898
HUST theses (Vietnamese)    11,159
Table: Statistics of the origins of the human-written texts.

FAID

Training Architecture

Figure: FAID training architecture.

Leveraging a multi-level contrastive learning loss, we fine-tune a language model (we select XLM-RoBERTa) on human-written, human-LLM collaborative, and LLM-generated texts, forcing the model to reorganize its hidden space: embeddings within the same author family are pulled closer together, while embeddings from different authors are pushed farther apart. The result is an encoder that represents text with signals distinctive enough to discern its authorship.
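
A minimal sketch of how such a multi-level contrastive objective can be set up is shown below. It is illustrative only: the temperature, the level weights, and the exact form of FAID's loss (which is further combined with the multi-task auxiliary classification heads mentioned in the abstract) are assumptions rather than the paper's actual hyperparameters.

```python
# Illustrative multi-level supervised contrastive loss (a sketch, not FAID's exact formulation).
# Assumptions: two label granularities per text -- an authorship label (human / human-LLM
# collaborative / LLM-generated) and an LLM-family label (e.g., GPT, Gemini, Llama, DeepSeek,
# with human texts as their own "family") -- plus placeholder weights and temperature.
import torch
import torch.nn.functional as F


def supcon_loss(embeddings: torch.Tensor, labels: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Supervised contrastive loss: pull same-label embeddings together, push others apart."""
    z = F.normalize(embeddings, dim=-1)                                    # (batch, dim), unit-normalized
    sim = z @ z.T / temperature                                            # pairwise cosine similarities
    n = labels.size(0)
    mask_self = torch.eye(n, device=z.device)
    mask_pos = (labels[:, None] == labels[None, :]).float() - mask_self    # positive pairs, excluding self
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()               # numerical stability
    exp_sim = torch.exp(sim) * (1.0 - mask_self)                           # exclude self from denominator
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)
    pos_counts = mask_pos.sum(dim=1)
    per_anchor = -(mask_pos * log_prob).sum(dim=1) / pos_counts.clamp(min=1)
    return per_anchor[pos_counts > 0].mean()                               # average over anchors with positives


def multi_level_loss(embeddings, authorship_labels, family_labels, alpha=1.0, beta=0.5):
    """Combine contrastive terms at the authorship level and at the LLM-family level."""
    return (alpha * supcon_loss(embeddings, authorship_labels)
            + beta * supcon_loss(embeddings, family_labels))
```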

Inference Architecture

Figure: FAID inference architecture.
  • Embed the input text into an embedding vector using the fine-tuned encoder.
  • Apply fuzzy kNN to the embedding to retrieve the cluster that the input text belongs to.
  • The stored vector database VD is built by embedding all texts from the training and validation sets with the fine-tuned encoder. If the input text appears unseen, its embedding is stored in a temporary vector database VD', which improves the detector's generalization; a minimal sketch of this inference loop follows the list.
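
The sketch below illustrates this inference loop under stated assumptions; the actual FAID implementation, distance metric, fuzziness parameter, and novelty threshold may differ.

```python
# Illustrative inference over the stored vector database VD (a sketch; the distance metric,
# fuzziness parameter m, and novelty threshold are assumptions, not FAID's exact settings).
import numpy as np


def fuzzy_knn_predict(query_emb, db_embs, db_labels, k=10, m=2.0):
    """Fuzzy kNN: weight the k nearest stored embeddings by inverse distance and
    return soft per-label membership scores for the query embedding."""
    dists = np.linalg.norm(db_embs - query_emb, axis=1)         # distances to every stored vector
    nn_idx = np.argsort(dists)[:k]                               # indices of the k nearest neighbors
    weights = 1.0 / np.maximum(dists[nn_idx], 1e-8) ** (2.0 / (m - 1.0))
    scores = {}
    for idx, w in zip(nn_idx, weights):
        scores[db_labels[idx]] = scores.get(db_labels[idx], 0.0) + w
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}    # normalized memberships


def detect(text, encoder, vd_embs, vd_labels, temp_db, novelty_threshold=0.8):
    """Embed the input with the fine-tuned encoder, classify it against VD, and cache
    apparently unseen inputs in the temporary database VD' (here `temp_db`)."""
    emb = encoder(text)                                         # fine-tuned XLM-RoBERTa embedding
    memberships = fuzzy_knn_predict(emb, vd_embs, vd_labels)
    label, confidence = max(memberships.items(), key=lambda kv: kv[1])
    if confidence < novelty_threshold:                          # looks unlike anything stored in VD
        temp_db.append((emb, label))                            # grow VD' to aid later queries
    return label, memberships
```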

Main Results

Tables 1 and 2 summarize the three-label classification results (human-written vs. LLM-generated vs. human-LLM collaborative) for FAID and three baselines under both in-domain and out-of-domain settings. FAID consistently achieves the strongest performance, while SeqXGPT degrades notably on newer, more fluent generators.
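
The metrics reported in both tables (Accuracy, Precision, Recall, F1-macro, MSE, and MAE) can be reproduced with standard scikit-learn calls. The sketch below assumes the three labels are mapped to ordinal indices (e.g., 0 = human-written, 1 = human-LLM collaborative, 2 = LLM-generated) so that MSE and MAE reflect how far a misclassification lands on that scale; the exact mapping used in the paper is an assumption here.

```python
# Sketch of the reported metrics, assuming ordinal label indices
# (e.g., 0 = human-written, 1 = human-LLM collaborative, 2 = LLM-generated).
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             mean_squared_error, precision_score, recall_score)


def evaluate(y_true, y_pred):
    """Compute the six metrics shown in Tables 1 and 2 from integer label arrays."""
    return {
        "Accuracy": accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred, average="macro"),
        "Recall": recall_score(y_true, y_pred, average="macro"),
        "F1-macro": f1_score(y_true, y_pred, average="macro"),
        "MSE": mean_squared_error(y_true, y_pred),
        "MAE": mean_absolute_error(y_true, y_pred),
    }
```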

Table 1: Performance on Known Domains and Generators

Dataset         Detector         Accuracy ↑   Precision ↑   Recall ↑   F1-macro ↑   MSE ↓    MAE ↓
FAIDSet         LLM-DetectAIve   94.34        94.45         93.79      94.10        0.1888   0.1107
                T5-Sentinel      93.31        94.92         93.10      93.15        0.2104   0.1101
                SeqXGPT          85.77        85.49         86.02      84.69        0.5593   0.2844
                FAID             95.58        95.78         95.33      95.54        0.1719   0.0875
LLM-DetectAIve  LLM-DetectAIve   95.71        95.78         95.72      95.71        0.1606   0.1314
                T5-Sentinel      94.77        94.70         92.60      93.60        0.1663   0.1503
                SeqXGPT          81.48        78.72         74.91      76.71        0.3141   0.2255
                FAID             96.99        95.29         88.14      91.58        0.1561   0.0754
HART            LLM-DetectAIve   94.39        94.25         94.33      94.29        0.3244   0.1789
                T5-Sentinel      86.68        87.25         87.69      87.38        0.4339   0.2334
                SeqXGPT          63.12        64.01         65.27      64.05        1.0057   0.5982
                FAID             96.73        97.61         98.05      97.80        0.4631   0.1806

FAID achieves the best accuracy on all three in-domain benchmarks (FAIDSet, LLM-DetectAIve, and HART) and the best F1-macro on FAIDSet and HART. LLM-DetectAIve and T5-Sentinel are competitive but generally fall short of FAID, while SeqXGPT degrades substantially, particularly on datasets containing more advanced, human-like LLM outputs.


Table 2: Performance with Unseen Data

Setting                      Detector         Accuracy ↑   Precision ↑   Recall ↑   F1-macro ↑   MSE ↓    MAE ↓
Unseen domain                LLM-DetectAIve   52.83        47.31         64.62      53.28        0.4733   0.4722
                             T5-Sentinel      55.56        49.54         66.67      55.34        0.4444   0.4444
                             SeqXGPT          40.60        43.81         31.87      36.72        0.8021   0.7028
                             FAID             62.78        70.73         71.77      69.46        0.4514   0.4486
Unseen generators            LLM-DetectAIve   75.71        73.25         75.63      74.30        0.3714   0.2957
                             T5-Sentinel      85.95        85.77         84.59      85.16        0.3648   0.2419
                             SeqXGPT          72.04        60.33         48.94      54.12        0.4590   0.3380
                             FAID             93.31        92.40         94.44      93.25        0.1691   0.1167
Unseen domain & generators   LLM-DetectAIve   62.93        66.74         71.17      61.97        0.4479   0.3964
                             T5-Sentinel      57.07        49.82         66.61      55.45        0.4314   0.4300
                             SeqXGPT          40.71        47.95         35.21      40.09        0.8753   0.7086
                             FAID             66.55        74.44         73.57      72.58        0.3939   0.3167

FAID clearly outperforms all baselines on unseen generators, reaching 93.31% accuracy, and also provides the strongest results when both domains and generators are unseen. Generalizing to unseen domains remains challenging for all methods, but FAID still delivers the highest accuracy at 62.78%, demonstrating improved robustness over competing detectors.


Overall, the results indicate that FAID effectively captures generalizable stylistic signals tied to LLM families, enabling superior performance across both in-domain and out-of-domain scenarios without overfitting to surface-level artifacts.

BibTeX

@misc{ta-etal-2025-faid,
  title         = {{FAID}: {F}ine-Grained {AI}-Generated Text Detection Using Multi-Task Auxiliary and Multi-Level Contrastive Learning},
  author        = {Minh Ngoc Ta and Dong Cao Van and Duc-Anh Hoang and Minh Le-Anh and Truong Nguyen and My Anh Tran Nguyen and Yuxia Wang and Preslav Nakov and Sang Dinh},
  year          = {2025},
  eprint        = {2505.14271},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/2505.14271},
}