Date of Award
2025
Type
Thesis
Major
Computer Science - Applied Computing Track
Degree Type
Master of Science in Applied Computer Science
Department
TSYS School of Computer Science
First Advisor
Dr. Rania Hodhod
Second Advisor
Dr. Lydia Ray
Third Advisor
Dr. Riduan Abid
Abstract
This thesis presents a study on fine-tuning large language models (LLMs) for domain-specific applications using limited data. We fine-tuned the LLaMA-2 (7B) model on a curated entrepreneurial dataset of 3,545 human-written question-answer pairs, of which 3,095 were used for training and 450 were reserved for evaluation. A complete fine-tuning and evaluation pipeline was developed: human-written answers were clustered, a centroid-based summary was generated for each cluster, and the model's generated responses were scored against these summaries using cosine similarity.
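To make the evaluation pipeline concrete, the minimal sketch below illustrates the clustering and cosine-similarity scoring described above, assuming sentence-transformers embeddings and scikit-learn KMeans; the embedding model name and the cluster count are illustrative assumptions, not the exact configuration used in the thesis.

# Minimal sketch of the clustering-and-similarity evaluation pipeline.
# Assumes sentence-transformers for embeddings and scikit-learn for
# KMeans; the embedding model and cluster count are illustrative
# choices, not necessarily those used in the thesis.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical choice

def build_reference_summaries(answers, n_clusters=10):
    """Cluster human-written answers and keep, per cluster, the answer
    closest to the centroid as that cluster's reference summary."""
    vecs = embedder.encode(answers, normalize_embeddings=True)
    km = KMeans(n_clusters=n_clusters, n_init="auto", random_state=0).fit(vecs)
    summaries = []
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(vecs[idx] - km.cluster_centers_[c], axis=1)
        summaries.append(answers[idx[np.argmin(dists)]])
    return summaries

def score_response(generated, summaries):
    """Cosine similarity between a generated answer and the most
    similar cluster summary."""
    gen_vec = embedder.encode([generated], normalize_embeddings=True)
    ref_vecs = embedder.encode(summaries, normalize_embeddings=True)
    return float(cosine_similarity(gen_vec, ref_vecs).max())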
Training was carried out over five epochs, with model performance evaluated after each epoch. The fine-tuned model demonstrated strong semantic alignment with human-written content, producing coherent, detailed, and contextually appropriate answers to entrepreneurial questions. Through structured clustering and summary-based evaluation, we established a reliable framework for assessing model quality in specialized domains. We also evaluated model performance in terms of training loss and per-epoch training time. In the first experiment (LoRA parameters only), the fastest epoch took 402.7 seconds with a loss of 1.7690, and the lowest loss of 1.7513 was reached in 407 seconds. In the second experiment (LoRA + LLaMA-2), the fastest epoch took 399.8 seconds with a loss of 1.9587, and the lowest loss of 1.7576 was reached in 404 seconds. After fine-tuning, the model achieved a cosine similarity of 0.4744 on the original 25 questions; for five paraphrased questions, scores ranged from 0.4949 to 0.6606; and for three general questions, the scores were 0.3409, 0.5294, and 0.401.
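The following sketch shows how such a LoRA fine-tuning run over LLaMA-2 (7B) could be set up with Hugging Face Transformers and PEFT, using 4-bit (QLoRA-style) loading; all hyperparameters, target modules, and the dataset variable are illustrative assumptions rather than the thesis's exact settings.

# Minimal sketch of LoRA fine-tuning for LLaMA-2 (7B) with Hugging Face
# PEFT; hyperparameters here are illustrative, not the thesis's exact
# settings. 4-bit (QLoRA-style) loading is shown via bitsandbytes.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments, Trainer)
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"
bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb)

lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)  # only the LoRA adapters stay trainable

args = TrainingArguments(output_dir="out", num_train_epochs=5,
                         per_device_train_batch_size=4,
                         learning_rate=2e-4, logging_steps=50)
# train_ds is a tokenized question-answer dataset (not shown here):
# trainer = Trainer(model=model, args=args, train_dataset=train_ds)
# trainer.train()  # track loss and wall-clock time after each epoch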
This work demonstrates that large language models can be effectively adapted to niche areas even when working with modest-sized datasets. The results highlight the potential of structured fine-tuning techniques, combined with efficient evaluation strategies, to achieve high-quality domain adaptation, contributing to the broader goal of making large models more accessible and specialized for practical, real-world applications.
Recommended Citation
Saha, Gaurob, "QLoRaX: Heuristic-Guided Fine-Tuning of LLaMA-2 for Domain Adaptation in Entrepreneurship" (2025). Theses and Dissertations. 575.
https://csuepress.columbusstate.edu/theses_dissertations/575