
preprints_ui: mx83s_v2

Denormalized preprint data with contributors and subjects for efficient UI access

Data license: ODbL (database) & original licenses (content) · Data source: Open Science Framework

This row is also available as JSON via the Datasette API.
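For programmatic access, a minimal sketch in Python, assuming the instance is served at the hypothetical host https://example.org; the database and table names (preprints, preprints_ui) follow this page's path, and Datasette's ?_shape=array parameter returns rows as a plain JSON array of objects:

    import json

    import requests

    # Hypothetical Datasette host; replace with the instance serving this page.
    BASE_URL = "https://example.org"

    # Datasette row endpoint: /<database>/<table>/<primary-key>.json
    # ?_shape=array asks for the row(s) as a plain JSON array of objects.
    url = f"{BASE_URL}/preprints/preprints_ui/mx83s_v2.json?_shape=array"
    row = requests.get(url, timeout=30).json()[0]

    # contributors_data is a JSON string embedded in the row; decode it separately.
    contributors = json.loads(row["contributors_data"])
    for person in sorted(contributors, key=lambda c: c["index"]):
        print(person["name"], person.get("orcid") or "(no ORCID)")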

id: mx83s_v2
title: Investigating Knowledge Graphs as Structured External Memory to Enhance Large Language Models’ Generation for Mathematical Concept Answering
description: Interactive question-answering (QA) with a tutor is an effective learning method for middle school math students. The flexibility and emergent capabilities of generative large language models (LLMs) have sparked significant interest in automating this process to deepen conceptual understanding in mathematics learning. However, the issue of LLM hallucination is well documented in the research community: LLMs' responses to math problems can be inaccurate or misaligned with educational contexts, such as school curricula. A potential mitigation is retrieval-augmented generation (RAG), which integrates external knowledge sources into LLM prompts to enhance response quality. Although traditional RAG with vector embeddings has shown promise in improving generation quality, it has limitations in retrieval scalability, granularity, and explainability. One way to improve on traditional RAG is to use a knowledge graph (KG). In this paper, we design a method to construct a KG with an LLM, retrieve contextual prompts from high-quality math teaching resources via the KG, and then enhance the prompts to generate answers to real student questions about mathematical concepts. To evaluate the effect of retrieved information in RAG, we designed three levels of guidance prompts corresponding to three degrees of groundedness: high, low, and none. We evaluate from two perspectives: we measure the semantic similarity between the retrieved context and the generated text, and we conduct an experimental study to investigate the impact of our KG-enhanced QA system on student learning gains. The study found that, compared to baseline methods, the KG-based RAG approach enhances the groundedness and faithfulness of both the retrieved documents and the generated context, and that low guidance can improve student learning gains. Our technical framework and analysis results contribute to advancing the application of LLMs in educational technology.
date_created: 2025-04-07T22:10:22.225800
date_modified: 2025-04-08T02:31:53.650601
date_published: 2025-04-07T22:35:35.946684
original_publication_date: (none)
publication_doi: (none)
provider: edarxiv
is_published: 1
reviews_state: accepted
version: 2
is_latest_version: 1
preprint_doi: https://doi.org/10.35542/osf.io/mx83s_v2
license: Academic Free License (AFL) 3.0
tags_list: (none)
tags_data: []
contributors_list: Chenglu Li; Wanli Xing; Hai Li; Wangda Zhu; Bailing Lyu; Zeyu Yan
contributors_data: [{"id": "ap9tz", "name": "Chenglu Li", "index": 0, "orcid": "0000-0002-1782-0457", "bibliographic": true}, {"id": "q53zs", "name": "Wanli Xing", "index": 1, "orcid": null, "bibliographic": true}, {"id": "vauqm", "name": "Hai Li", "index": 2, "orcid": "0009-0004-7299-2042", "bibliographic": true}, {"id": "ptkwe", "name": "Wangda Zhu", "index": 3, "orcid": null, "bibliographic": true}, {"id": "x2adw", "name": "Bailing Lyu", "index": 4, "orcid": null, "bibliographic": true}, {"id": "gtbh6", "name": "Zeyu Yan", "index": 5, "orcid": null, "bibliographic": true}]
first_author: Chenglu Li
subjects_list: Education
subjects_data: [{"id": "5d0b8ccabe1a2300167c87f3", "text": "Education"}]
download_url: https://osf.io/download/67f44d589fa7994215b2561b
has_coi: 0
conflict_of_interest_statement: (none)
has_data_links: no
has_prereg_links: not_applicable
prereg_links: []
prereg_link_info: (none)
last_updated: 2025-04-09T21:06:17.243911
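
Because the row is denormalized, nested data such as contributors and subjects is embedded as JSON text. Relational-style queries remain possible with SQLite's JSON1 functions; a sketch, assuming a local copy of the backing database saved under the hypothetical filename preprints.db:

    import sqlite3

    # Hypothetical local copy of the SQLite database behind this Datasette instance.
    conn = sqlite3.connect("preprints.db")
    conn.row_factory = sqlite3.Row

    # json_each() expands the embedded contributors_data array into one row per
    # contributor; ordering by the stored "index" field preserves author order.
    rows = conn.execute(
        """
        select p.id,
               json_extract(c.value, '$.name')  as name,
               json_extract(c.value, '$.orcid') as orcid
        from preprints_ui as p, json_each(p.contributors_data) as c
        where p.id = ?
        order by json_extract(c.value, '$.index')
        """,
        ("mx83s_v2",),
    ).fetchall()

    for r in rows:
        print(r["id"], r["name"], r["orcid"] or "(no ORCID)")

The table stays one row per preprint, so a UI can render a full record with a single lookup, while json_each recovers a normalized view when one is needed.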