Research
Linguistic Diversity and Model Collapse in Large Language Models
Blacksburg, US · Feb 2026 – Present
Supervisors: Alan Wang, Wenqi Shen
- Framed the study as design-science research, grounding it in an encoding-variability mechanism that links the linguistic diversity of training text to a model's robustness against collapse under recursive training.
- Designed a Writeprints linguistic-diversity framework that isolates which dimensions of training text drive the deepest diversity loss; the longer-term aim is to guide data interventions that reduce both diversity loss and long-tail knowledge loss.
- Implemented a Writeprints feature pipeline in Python computing 19 lexical, syntactic, structural, and content dimensions per document, so each diversity construct can be measured and disentangled separately.
- Fine-tuned a Llama-3.2-1B model with LoRA across five generations of recursive training on Wikipedia corpora, reproducing the iterative self-training loop that drives model collapse.
The Null Hypothesis at 100: NHST as an Epistemic Institution in Management Research
Blacksburg, US · Feb 2026 – Present
Supervisors: Richard A. Hunt, Joseph Simpson
- Conducted a science-of-science integrative review of how NHST (Null Hypothesis Significance Testing) became an institutionalized epistemic rule in management research rather than a purely statistical technique.
- Designed a reproducible, Python-based search and screening protocol for Web of Science; predefined inclusion criteria narrowed 492 candidate articles from FT50 and leading review journals to a final sample of 165 peer-reviewed papers.
- Scored each article's stance toward NHST through LLM-based polarity analysis and mapped papers onto management sub-fields; built the sentiment heat maps and publication-trend figures in the review.
- Proposed one of the review's six core insights: the risk of epistemic inversion under generative AI, where AI workflows can reinforce p-hacking by reasoning backward from detectable patterns to seemingly theory-driven claims.
The Impact of Generative AI Models on Consumer Purchase Behavior in E-Commerce
Xi'an, China · Jul 2024 – Jun 2025
Supervisors: Rong Du, David M. Townsend
- Used a quasi-natural experiment with Difference-in-Differences (DID) to study the impact of AI applications (recommendation, review summarization, knowledge Q&A) on consumer purchasing behavior and perceived quality.
- Developed web-scraping solutions with the Scrapy framework, extracting more than 100,000 user reviews.
- Conducted sentiment analysis with LLM APIs and NLP techniques (word2vec, BERT) and linguistic analysis with LIWC to identify key features in user reviews.
- Cleaned and processed over 1 million transaction records with ETL pipelines using NumPy, Pandas, and DuckDB.
- Combined cue-utilization theory with generative AI in e-commerce, to our knowledge the first study to do so, to analyze long-tail vs. winner-take-all dynamics; introduced Baidu Index as an objective brand-awareness metric, paired with real sales data.
A Cognitive Rules-based Framework for AIGC Trustworthiness
Xi'an, China · Jan 2024 – Oct 2025
Supervisors: Rong Du, Richard A. Hunt
- Pioneered a diagnostic framework for AI-Generated Content (AIGC) trustworthiness by integrating Nomology theory with a dual cognitive perspective; the multi-level, nine-dimension framework was built via Grounded Theory and validated through large-scale text mining.
- Conducted semi-structured interviews with over 20 users, including domestic and international undergraduates, doctoral students, and industry professionals.
- Performed text mining and sentiment analysis on social-media data using topic modeling (e.g., BERTopic) to validate the framework and quantify trust across the nine dimensions.
- Diagnosed a core "representation–substance imbalance": positive perceptions of AIGC's surface experience are undermined by distrust in its core substance (information quality, transparency).
The Impact of Traceability Information on Consumer Purchase Behavior in E-Commerce
Xi'an, China · Oct 2023 – Present
Supervisors: Rong Du, Andrew Burton-Jones
- Used a quasi-natural experiment with DID to study how product traceability information (traditional and blockchain-based) affects consumer purchase behavior, with brand reputation and eWOM as moderators.
- Applied LDA topic modeling and text mining to 14,000 product reviews to construct an attention index that captures which topics users focus on.
Education
Xidian University & Virginia Tech
Xi'an & Blacksburg · Sep 2022 – May 2026
Rank 1 / 93
- B.Mgmt. in Big Data Management and Application (Xidian), GPA 3.9 / 4.0
  Minor: Artificial Intelligence & Large Language Model Application (Xidian)
- B.S. in Business, Management: Entrepreneurship, Innovation & Technology Management (Virginia Tech), GPA 3.78 / 4.0
Selected core courses
- Business Intelligence Analysis (Data Mining) — A−
- Introduction to Data Science — A
- Management Information System — A−
- Database Principle and Application — A
- Business Statistics, Analytics & Modeling — A
- Statistical Machine Learning & Text Mining — A
- Natural Language Processing — A
- Principles of Large Models and Industry Applications — A