CSI Library: Generative AI for Faculty: Collected Resources

STEM

The comparison of general tips for mathematical problem solving generated by generative AI with those generated by human teachers
Jia, J., Wang, T., Zhang, Y., & Wang, G. (2024). The comparison of general tips for mathematical problem solving generated by generative AI with those generated by human teachers. Asia Pacific Journal of Education, 44(1), 8–28. https://doi.org/10.1080/02188791.2023.2286920

In designing an intelligent tutoring system, a core area of the application of AI in education, tips from the system or virtual tutors are crucial in helping students solve difficult questions in disciplines like mathematics. Traditionally, the manual design of general tips by teachers is time-consuming and error-prone. Generative AI, like ChatGPT, presents a new channel for designing general tips. This study utilized prompt engineering and Chain of Thought to summarize general tips for given mathematical problems (one geometry problem and one algebra problem) and their solutions. A Turing test was conducted to compare ChatGPT-generated general tips with human-designed ones. Results from 121 human evaluators, each assessing 6 ChatGPT-generated and 6 human-designed general tips for each of two mathematical problems, showed that the average score for ChatGPT-generated tips is less than that of human-designed tips at a statistically significant level (p < 0.05), and Zero-Shot CoT achieved the best score. However, no evaluator could distinguish the tip types exactly. The average precision, recall and F-value of all ChatGPT-generated tips are less than 40%. AI-generated general tips can serve as a valuable reference for teachers to enhance efficiency and students' mathematical learning.
Large Language Models for Mathematical Reasoning: Progresses and Challenges
Ahn, J., Verma, R., Lou, R., Liu, D., Zhang, R., & Yin, W. (2024). Large Language Models for Mathematical Reasoning: Progresses and Challenges. arXiv. https://doi.org/10.48550/arxiv.2402.00157

Mathematical reasoning serves as a cornerstone for assessing the fundamental cognitive capabilities of human intelligence. In recent times, there has been a notable surge in the development of Large Language Models (LLMs) geared towards the automated resolution of mathematical problems. However, the landscape of mathematical problem types is vast and varied, with LLM-oriented techniques undergoing evaluation across diverse datasets and settings. This diversity makes it challenging to discern the true advancements and obstacles within this burgeoning field. This survey endeavors to address four pivotal dimensions: i) a comprehensive exploration of the various mathematical problems and their corresponding datasets that have been investigated; ii) an examination of the spectrum of LLM-oriented techniques that have been proposed for mathematical problem-solving; iii) an overview of factors and concerns affecting LLMs in solving math; and iv) an elucidation of the persisting challenges within this domain. To the best of our knowledge, this survey stands as one of the first extensive examinations of the landscape of LLMs in the realm of mathematics, providing a holistic perspective on the current state, accomplishments, and future challenges in this rapidly evolving field.
Learning Mathematics with Large Language Models: A Comparative Study with Computer Algebra Systems and Other Tools
Matzakos, N., Doukakis, S., & Moundridou, M. (2023). Learning Mathematics with Large Language Models: A Comparative Study with Computer Algebra Systems and Other Tools. International Journal of Emerging Technologies in Learning, 18(20), 51–71. https://doi.org/10.3991/ijet.v18i20.42979

Artificial intelligence (AI) has permeated all human activities, bringing about significant changes and creating new scientific and ethical challenges. The field of education could not be an exception to this development. OpenAI's unveiling of ChatGPT, their large language model (LLM), has sparked significant interest in the potential applications of this technology in education. This paper aims to contribute to the ongoing discussion on the role of AI in education and its potential implications for the future of learning by exploring how LLMs could be utilized in the teaching of mathematics in higher education and how they compare to the currently widely used computer algebra systems (CAS) and other mathematical tools. It argues that these innovative tools have the potential to provide functional and pedagogical opportunities that may influence changes in curriculum and assessment approaches.
A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level
Drori, I., Zhang, S., Shuttleworth, R., Tang, L., Lu, A., Ke, E., Liu, K., Chen, L., Tran, S., Cheng, N., Wang, R., Singh, N., Patti, T. L., Lynch, J., Shporer, A., Verma, N., Wu, E., & Strang, G. (2022). A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level. Proceedings of the National Academy of Sciences of the United States of America, 119(32), e2123433119. https://doi.org/10.1073/pnas.2123433119

We demonstrate that a neural network automatically solves, explains, and generates university-level problems from the largest Massachusetts Institute of Technology (MIT) mathematics courses at a human level. Our methods combine three innovations: 1) using recent neural networks pretrained on text and fine-tuned on code rather than pretrained on text; 2) few-shot learning synthesizing programs that correctly solve course problems automatically; and 3) a pipeline to solve questions, explain solutions, and generate new questions indistinguishable by students from course questions. Our work solves university-level mathematics courses and improves upon state-of-the-art, increasing automatic accuracy on randomly sampled questions on a benchmark by order of magnitude. Implications for higher education include roles of artificial intelligence (AI) in automated course evaluation and content generation.

Math Education with Large Language Models: Peril or Promise?
Kumar, H., Rothschild, D. M., Goldstein, D. G., & Hofman, J. (2023, November 22). Math Education with Large Language Models: Peril or Promise? SSRN. https://ssrn.com/abstract=4641653 or http://dx.doi.org/10.2139/ssrn.4641653

Can you trust ChatGPT and other LLMs in math?
This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI.

from TechTalks
Future of software development with generative AI
Sauvola, J., Tarkoma, S., Klemettinen, M., Riekki, J., & Doermann, D. (2024). Future of software development with generative AI. Automated Software Engineering, 31(1), Article 26. https://doi.org/10.1007/s10515-024-00426-z

Generative AI is regarded as a major disruption to software development. Platforms, repositories, clouds, and the automation of tools and processes have been proven to improve productivity, cost, and quality. Generative AI, with its rapidly expanding capabilities, is a major step forward in this field. As a new key enabling technology, it can be used for many purposes, from creative dimensions to replacing repetitive and manual tasks. The number of opportunities increases with the capabilities of large-language models (LLMs). This has raised concerns about ethics, education, regulation, intellectual property, and even criminal activities. We analyzed the potential of generative AI and LLM technologies for future software development paths. We propose four primary scenarios, model trajectories for transitions between them, and reflect against relevant software development operations. The motivation for this research is clear: the software development industry needs new tools to understand the potential, limitations, and risks of generative AI, as well as guidelines for using it.
Learning to Make Rare and Complex Diagnoses With Generative AI Assistance: Qualitative Study of Popular Large Language Models
Abdullahi, T., Singh, R., & Eickhoff, C. (2024). Learning to Make Rare and Complex Diagnoses With Generative AI Assistance: Qualitative Study of Popular Large Language Models. JMIR Medical Education, 10, e51391. https://doi.org/10.2196/51391

Patients with rare and complex diseases often experience delayed diagnoses and misdiagnoses because comprehensive knowledge about these diseases is limited to only a few medical experts. In this context, large language models (LLMs) have emerged as powerful knowledge aggregation tools with applications in clinical decision support and education domains.
Use of ChatGPT: What does it mean for biology and environmental science?
Agathokleous, E., Saitanis, C. J., Fang, C., & Yu, Z. (2023). Use of ChatGPT: What does it mean for biology and environmental science? The Science of the Total Environment, 888, 164154. https://doi.org/10.1016/j.scitotenv.2023.164154

Artificial intelligence (AI) large language models (LLMs) have emerged as important technologies. Recently, ChatGPT (Generative Pre-trained Transformer) has been released and attracted massive interest from the public, owing to its unique capabilities to simplify many daily tasks of people from diverse backgrounds and social statuses. Here, we discuss how ChatGPT (and similar AI technologies) can impact biology and environmental science, providing examples obtained through interactive sessions with ChatGPT. The benefits that ChatGPT offers are ample and can impact many aspects of biology and environmental science, including education, research, scientific publishing, outreach, and societal translation. Among others, ChatGPT can simplify and expedite highly complex and challenging tasks. As an example to illustrate this, we provide 100 important questions for biology and 100 important questions for environmental science. Although ChatGPT offers a plethora of benefits, there are several risks and potential harms associated with its use, which we analyze herein. Awareness of risks and potential harms should be raised. However, understanding and overcoming the current limitations could lead these recent technological advances to push biology and environmental science to their limits.

Social Sciences

Clinical science and practice in the age of large language models and generative artificial intelligence.
Schueller, S. M., & Morris, R. R. (2023). Clinical science and practice in the age of large language models and generative artificial intelligence. Journal of Consulting and Clinical Psychology, 91(10), 559–561. https://doi.org/10.1037/ccp0000848

In this article, Schueller and Morris discuss recent advances in large language models (LLMs) and generative artificial intelligence (AI). They consider how these tools can support humans in providing better interventions and in understanding processes in clinical interventions, and they offer ethical considerations for the use of generative AI in clinical research and practice.
Capacity of Generative AI to Interpret Human Emotions From Visual and Textual Data: Pilot Evaluation Study
Elyoseph, Z., Refoua, E., Asraf, K., Lvovsky, M., Shimoni, Y., & Hadar-Shoval, D. (2024). Capacity of Generative AI to Interpret Human Emotions From Visual and Textual Data: Pilot Evaluation Study. JMIR Mental Health, 11, e54369. https://doi.org/10.2196/54369

Mentalization, which is integral to human cognitive processes, pertains to the interpretation of one's own and others' mental states, including emotions, beliefs, and intentions. With the advent of artificial intelligence (AI) and the prominence of large language models in mental health applications, questions persist about their aptitude in emotional comprehension. The prior iteration of the large language model from OpenAI, ChatGPT-3.5, demonstrated an advanced capacity to interpret emotions from textual data, surpassing human benchmarks. Given the introduction of ChatGPT-4, with its enhanced visual processing capabilities, and considering Google Bard's existing visual functionalities, a rigorous assessment of their proficiency in visual mentalizing is warranted. The aim of the research was to critically evaluate the capabilities of ChatGPT-4 and Google Bard with regard to their competence in discerning visual mentalizing indicators as contrasted with their textual-based mentalizing abilities. The Reading the Mind in the Eyes Test developed by Baron-Cohen and colleagues was used to assess the models' proficiency in interpreting visual emotional indicators. Simultaneously, the Levels of Emotional Awareness Scale was used to evaluate the large language models' aptitude in textual mentalizing. Collating data from both tests provided a holistic view of the mentalizing capabilities of ChatGPT-4 and Bard. ChatGPT-4, displaying a pronounced ability in emotion recognition, secured scores of 26 and 27 in 2 distinct evaluations, significantly deviating from a random response paradigm (P<.001). These scores align with established benchmarks from the broader human demographic. Notably, ChatGPT-4 exhibited consistent responses, with no discernible biases pertaining to the sex of the model or the nature of the emotion. In contrast, Google Bard's performance aligned with random response patterns, securing scores of 10 and 12 and rendering further detailed analysis redundant. 
In the domain of textual analysis, both ChatGPT and Bard surpassed established benchmarks from the general population, with their performances being remarkably congruent. ChatGPT-4 proved its efficacy in the domain of visual mentalizing, aligning closely with human performance standards. Although both models displayed commendable acumen in textual emotion interpretation, Bard's capabilities in visual emotion interpretation necessitate further scrutiny and potential refinement. This study stresses the criticality of ethical AI development for emotional recognition, highlighting the need for inclusive data, collaboration with patients and mental health experts, and stringent governmental oversight to ensure transparency and protect patient privacy.
Towards a Psychology of Machines: Large Language Models Predict Human Memory
Huff, M., & Ulakçı, E. (2024). Towards a Psychology of Machines: Large Language Models Predict Human Memory. arXiv. https://doi.org/10.48550/arxiv.2403.05152

Large language models (LLMs) are demonstrating remarkable capabilities across various tasks despite lacking a foundation in human cognition. This raises the question: can these models, beyond simply mimicking human language patterns, offer insights into the mechanisms underlying human cognition? This study explores the ability of ChatGPT to predict human performance in a language-based memory task. Building upon theories of text comprehension, we hypothesize that recognizing ambiguous sentences (e.g., "Because Bill drinks wine is never kept in the house") is facilitated by preceding them with contextually relevant information. Participants, both human and ChatGPT, were presented with pairs of sentences. The second sentence was always a garden-path sentence designed to be inherently ambiguous, while the first sentence provided either a fitting context (e.g., "Bill has chronic alcoholism") or an unfitting context (e.g., "Bill likes to play golf"). We measured both humans' and ChatGPT's ratings of sentence relatedness, ChatGPT's memorability ratings for the garden-path sentences, and humans' spontaneous memory for the garden-path sentences. The results revealed a striking alignment between ChatGPT's assessments and human performance. Sentences deemed more related and assessed as being more memorable by ChatGPT were indeed better remembered by humans, even though ChatGPT's internal mechanisms likely differ significantly from human cognition. This finding, which was confirmed with a robustness check employing synonyms, underscores the potential of generative AI models to predict human performance accurately. We discuss the broader implications of these findings for leveraging LLMs in the development of psychological theories and for gaining a deeper understanding of human cognition.
Editorial: Generative artificial intelligence and the ecology of human development
Schuengel, C., & van Heerden, A. (2023). Editorial: Generative artificial intelligence and the ecology of human development. Journal of Child Psychology and Psychiatry, 64(9), 1261–1263. https://doi.org/10.1111/jcpp.13860

Commercial applications of artificial intelligence (AI) in the form of Large Language Models (LLMs) and Generative AI have taken centre stage in the media sphere, business, public policy, and education. The ramifications for the field of child psychology and psychiatry are being debated and veer between LLMs as potential models for development and applications of generative AI becoming environmental factors for human development. This Editorial briefly discusses developmental research on generative AI and the potential impact of generative AI on the hybrid social world in which young people grow up. We end by considering that the rapid developments justify increasing attention in our field.

Humanities

Large Language Model-Based Artificial Intelligence in the Language Classroom: Practical Ideas for Teaching
Bonner, E., Lege, R., & Frazier, E. (2023). Large Language Model-Based Artificial Intelligence in the Language Classroom: Practical Ideas for Teaching. Teaching English with Technology, 23(1), 23-.

Large Language Models (LLMs) are a powerful type of Artificial Intelligence (AI) that simulates how humans organize language and are able to interpret, predict, and generate text. This allows for contextual understanding of natural human language which enables the LLM to understand conversational human input and respond in a natural manner. Recent examples of this, such as the Generative Pre-Trained Transformer (GPT) model, popularized by OpenAI's web application, ChatGPT, are able to complete an astounding variety of tasks when provided with simple language input. For education, LLMs can alleviate teacher curriculum and grading workloads and even perform specific tasks such as generating creative ideas for activities. Specifically for language learning, LLMs can draw on their immense corpus of language content to generate learner-centric materials to aid teachers in delivering targeted, personalized language instruction. The aim of this paper is to provide the reader with examples of how LLMs can be utilized for materials development, classroom activities, and providing feedback. After giving specific examples and explanations, the paper will conclude with a discussion of how this technology can provide teachers with new innovative ways to streamline the teaching process to focus on learner needs.
AI Text Generators and Teaching Writing: Starting Points For Inquiry
Mills, A. (Curator). (2022). AI Text Generators and Teaching Writing: Starting Points for Inquiry. Last updated November 18, 2023.

What do teachers who assign writing need to know about AI text generators? How should we change our pedagogical practices, given the recent advances in AI Large Language Models (LLMs) such as OpenAI's ChatGPT? How should teachers participate in shaping policies around these technologies in our departments, institutions, and society?

To shape our individual and institutional responses to this new technology, writing teachers and scholars need more information about the workings, the quality, and the ethics of AI text generation. We may be concerned about possible learning loss and academic integrity violations due to unacknowledged use of AI in writing. We may want to help guide our students to think critically about this technology or to use it effectively. Or we may want to find ways to use these generators in our pedagogy. As faculty responsible for teaching writing as impactful, ethical intellectual activity, we need spaces to build our own understanding of AI text and discuss how to respond.

General Resources

The AI Con by Emily M. Bender; Alex Hanna
ISBN: 9780063418561

Publication Date: 2025-05-13

A smart, incisive look at the technologies sold as artificial intelligence, the drawbacks and pitfalls of technology sold under this banner, and why it's crucial to recognize the many ways in which AI hype covers for a small set of power-hungry actors at work and in the world. Is artificial intelligence going to take over the world? Have big tech scientists created an artificial lifeform that can think on its own? Is it going to put authors, artists, and others out of business? Are we about to enter an age where computers are better than humans at everything? The answer to these questions, linguist Emily M. Bender and sociologist Alex Hanna make clear, is "no," "they wish," "LOL," and "definitely not." This kind of thinking is a symptom of a phenomenon known as "AI hype." Hype looks and smells fishy: It twists words and helps the rich get richer by justifying data theft, motivating surveillance capitalism, and devaluing human creativity in order to replace meaningful work with jobs that treat people like machines. In The AI Con, Bender and Hanna offer a sharp, witty, and wide-ranging take-down of AI hype across its many forms. Bender and Hanna show you how to spot AI hype, how to deconstruct it, and how to expose the power grabs it aims to hide. Armed with these tools, you will be prepared to push back against AI hype at work, as a consumer in the marketplace, as a skeptical newsreader, and as a citizen holding policymakers to account. Together, Bender and Hanna expose AI hype for what it is: a mask for Big Tech's drive for profit, with little concern for who it affects.
