Human-AI-Collaboration-For-Coding

Bibliographic Details
Title: Human-AI-Collaboration-For-Coding
Authors: Ravi, Siddhardha
Source: Theses, Dissertations and Culminating Projects
Publisher Information: Montclair State University Digital Commons
Publication Year: 2025
Collection: Montclair State University Digital Commons
Keywords: human-AI code collaboration, generative AI, code quality improvement, AI-generated code, BigCodeBench, software engineering, AI-first refinement, human-guided generation, large language models (LLMs), LLM-as-a-Judge, Artificial Intelligence and Robotics
Description: AI-generated code, while rapidly producing functional solutions, often falls short in areas where human expertise excels, such as comprehensive error handling, robust documentation, and sound architectural design. Conversely, humans can benefit greatly from AI's rapid code generation capabilities. This project proposes and evaluates "A Framework to Improve Code Quality by Utilizing Generative AI Coding Along With Human-Written Code", designed to create a synergy between AI and human intelligence for enhanced software development. Conducted over four weeks, the research uses BigCodeBench as its core dataset to rigorously investigate how human intervention can improve AI-generated code quality, identify the most effective human-AI collaboration patterns, and determine the optimal balance between human effort and quality improvement. The framework explores three distinct collaboration models: AI-First Refinement, where the AI generates and humans refine; Human-Guided Generation, where humans define structure and quality requirements up front; and Iterative Co-Creation, which alternates AI generation with human modification. Code quality is quantified using a comprehensive suite of automated static analysis metrics, including Cyclomatic Complexity, Lines of Code, and PEP 8 compliance, complemented by LLM-as-a-Judge evaluations assessing readability, maintainability, robustness, and efficiency. A key success criterion is achieving a ≥25% improvement in code quality metrics with <5 minutes of human input per task. Initial analyses suggest substantial improvements in code robustness and maintainability through collaborative refinement. This thesis details the framework's architecture, methodologies, and empirical results, contributing valuable insights into effective human-AI collaboration for superior code quality.
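
The static metrics named in the abstract map onto standard Python tooling. As a minimal sketch (not the thesis's actual pipeline), assuming the third-party radon and pycodestyle packages, the three metrics could be collected per task like this:

    # Sketch only: assumes the `radon` and `pycodestyle` packages are installed.
    import tempfile

    import pycodestyle
    from radon.complexity import cc_visit
    from radon.raw import analyze

    def quality_metrics(code: str) -> dict:
        """Collect the three static metrics named in the abstract."""
        # Cyclomatic Complexity: radon scores each function/class block;
        # report the worst block as the snippet's headline score.
        blocks = cc_visit(code)
        max_cc = max((b.complexity for b in blocks), default=1)

        # Lines of Code: radon's raw analysis gives total and logical lines.
        raw = analyze(code)

        # PEP 8 compliance: pycodestyle checks files, so spill to a temp file.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
        report = pycodestyle.StyleGuide(quiet=True).check_files([f.name])

        return {
            "max_cyclomatic_complexity": max_cc,
            "lines_of_code": raw.loc,
            "logical_lines": raw.lloc,
            "pep8_violations": report.total_errors,
        }

    print(quality_metrics("def add(a,b):\n    return a+b\n"))

Per-task improvement could then be scored as the relative change in each metric between the baseline AI output and the collaboratively refined version, with the abstract's ≥25% threshold applied to the aggregate.
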
Publication Type: text
File Description: application/pdf
Language: unknown
Relation: https://digitalcommons.montclair.edu/etd/1580; https://digitalcommons.montclair.edu/context/etd/article/2584/viewcontent/Ravi__Siddhardha___Final_thesis_Redacted.pdf
Availability: https://digitalcommons.montclair.edu/etd/1580; https://digitalcommons.montclair.edu/context/etd/article/2584/viewcontent/Ravi__Siddhardha___Final_thesis_Redacted.pdf
Document Code: edsbas.5063BFB0
Database: BASE