CS373 - Final Presentation Schedule - Mike Heroux Home Page

CSCI 373: Fall 2024 Final Presentation Schedule

Location: Room 270, Main Building, CSB, and virtually via Zoom

Family and friends are welcome to attend in person or via Zoom.

For virtual participation, register via Zoom to receive the connection link via email. The same Zoom link works for both sessions.

Register in advance: Zoom Registration Link

Monday, December 9, 2024

Session 1: 2:00 - 3:15 pm

Time	Speaker	Title	Abstract
2:00 pm	Alec Chamberlain	Portfolio Optimization Utilizing Machine Learning	Many individuals miss opportunities to maximize savings due to a lack of active financial management. This paper integrates machine learning techniques, specifically XGBoost and Long Short-Term Memory (LSTM) models, into stock portfolio optimization to enhance returns without increasing risk. By leveraging financial data from yfinance, XGBoost predicts stock performance, ranking stocks based on potential returns, while LSTM captures time-series patterns for improved prediction accuracy. Hierarchical Risk Parity (HRP) is applied to balance risk when allocating portfolio weights among top-performing stocks. Backtesting against the S&P 500 index demonstrates that our machine learning-driven portfolio achieves superior returns with comparable risk, highlighting the potential of machine learning to improve investment outcomes for individual investors.
2:15 pm	Will Magarian	Reeling in Phish: Using Random Forests to Catch Cyber Threats	Phishing attacks exploit human vulnerabilities to steal sensitive information, posing significant risks to individuals and organizations. Traditional detection methods struggle to adapt to evolving tactics, necessitating more robust solutions. This paper explores phishing detection using machine learning, focusing on Random Forests, which leverage decision trees to identify patterns in email data. By training on labeled datasets and extracting key features like metadata and content, Random Forests outperformed traditional techniques like Logistic Regression, achieving high accuracy with a false positive rate under 2%. The study emphasizes the value of integrating Random Forests into hybrid systems and suggests future advancements in feature selection and explainable AI to improve detection against evolving cyber threats. These findings highlight the potential for scalable and adaptive phishing defense systems.
2:30 pm	CJ Mahn	Predictive Maintenance: Past, Present, and Future	Predictive Maintenance (PdM) has evolved from traditional time-based methods to advanced data-driven models. Initially relying on periodic checks, PdM now leverages machine learning and sensor technologies. Random Forests (RF), with their ability to handle large datasets and missing data and their resistance to overfitting, are ideal for PdM applications. Today, RF models enhance maintenance across various industries. The future of PdM will integrate IoT sensors, real-time data processing, and advanced machine learning, enabling more accurate predictions, cost savings, and improved safety. With maturing IoT and cloud technologies, PdM systems will become increasingly automated and proactive.
2:45 pm	Maria Nathe	SQL Injection Attacks: Understanding, Preventing, and Mitigating	SQL Injection (SQLi) remains a critical threat to web application security. Despite advancements in cybersecurity, SQLi continues to exploit vulnerabilities, compromising sensitive data and system integrity. This paper investigates the evolution of SQLi attacks, including sophisticated methods such as Blind SQLi, Union-Based SQLi, and Out-of-Band SQLi. Advanced detection and mitigation approaches are analyzed, with a focus on machine learning-based intrusion detection systems, parameterized queries, and secure architectural practices. Results indicate that while traditional defenses such as Web Application Firewalls (WAFs) mitigate classic attacks, emerging technologies like Zero Trust Architecture and quantum-resistant encryption are essential for addressing future challenges. This comprehensive study highlights the enduring relevance of SQLi, providing actionable insights into prevention strategies and the need for continuous innovation in cybersecurity defenses.
3:00 pm	James Nguyen	Remote Attestation for TPM Devices	Remote attestation ensures the integrity of remote systems, a vital aspect of cybersecurity. Trusted Platform Modules (TPMs) offer a secure method for this, using cryptographic keys and measured boot processes to establish a chain of trust. This study examines TPM-based attestation, highlighting functionalities like PCR extensions and resets to securely verify hardware and software configurations. While TPMs provide reliable, cryptographically verifiable evidence, their adoption faces challenges in cost-sensitive environments like IoT. Despite these limitations, TPMs remain a robust solution, with potential for broader applicability through supplementary security measures and affordable alternatives.

Session 2: 3:30 - 4:45 pm

Time	Speaker	Title	Abstract
3:30 pm	Abbi Pexa	Backtesting in Algorithmic Trading: Mitigating Bias in Moving Averages	Algorithmic trading is prevalent in global financial markets, but many strategies fail due to optimization bias that leads to overfitting. My work focuses on moving average crossover strategies, exploring how rigorous backtesting can mitigate bias and ensure reliability across various market conditions. By employing walk-forward analysis, parameter sensitivity testing, and evaluations in different market environments, I showed that robust backtesting can identify strategies that generalize well to unseen data. The results indicate that these refined strategies achieve greater stability and consistent performance across datasets. This underscores the need for rigorous validation techniques in algorithmic trading and suggests that AI-powered simulations and real-time adaptive testing could be vital for improving strategy reliability and ethical practices in finance.
3:45 pm	Tyler Pohlman	Data Imputation Using Machine Learning Techniques	Data quality is essential for effective decision-making, yet missing data can compromise this goal. This study evaluates MissForest, a machine learning algorithm based on random forests, for imputing missing values under the Missing at Random (MAR) mechanism. Using metrics such as Root Mean Square Error (RMSE) and Percent Bias (PB), MissForest’s performance was compared to simple mean imputation. Results show that mean imputation introduces significant bias, suggesting that it fails to preserve relationships between features. MissForest provides superior accuracy and preserves data integrity. Additionally, emerging neural network-driven approaches such as Generative Adversarial Imputation Networks (GAIN) highlight advancements in handling complex data gaps. This work reinforces machine learning’s role in enhancing data preprocessing and analytics.
4:00 pm	Evan Quinn	The Present and Future of Race Strategy in F1	Formula 1 (F1), the pinnacle of motorsport, relies on optimizing race strategies to achieve victory in a millisecond-sensitive competition. Complex race conditions, advancing technologies, and sustainability goals demand innovative solutions. Challenges include managing variables like tire degradation, pit stop efficiency, and weather. While current methods struggle with real-time adaptability, this presentation looks at how those problems can be solved in the near future. This work examines current strategy solutions by using regression analysis for tire management and outlier detection for data refinement, while also exploring future technology like machine learning for real-time analytics and emerging tools like autonomous pit crews and drivers, hydrogen power, and weather-responsive AI, just to name a few.
4:15 pm	Daniel Schmitz	Verifying in a Post-Quantum Age	With the post-quantum age quickly approaching, we need a new way to verify and trust the information and files we receive. Old ways are quickly becoming compromised. I will demonstrate a simple implementation of Dilithium in Python.
4:30 pm	Josefa Zarate Dolores	Learning to See: How CNNs Transform Visual Data	Convolutional Neural Networks (CNNs) revolutionize how computers process visual data by mimicking the human visual system. This talk explores the core processes behind CNNs, including feature extraction, training, and classification, and their application in fields like medical imaging and autonomous vehicles. By analyzing current trends, I illustrate how CNNs excel in image recognition and predict their future evolution, focusing on integration with generative AI. This presentation aims to inform and inspire with insights into CNN advancements shaping the future of AI-driven vision technology.