Foundations of Computing and Algorithmic Thinking: Variables, data types, control flow (loops, conditionals), functions and modular programming, introduction to R and Python environments, writing and debugging basic scripts.
Computing for Data Science
This module provides the computational framework and algorithmic techniques needed to implement data-driven solutions in accounting, auditing, and financial management. It focuses on turning theoretical models into functional, scalable code in R and Python. The module first establishes a foundation in algorithmic logic and data structures so that students can efficiently manipulate complex financial datasets. It then moves to advanced computational methods, covering the implementation of optimization algorithms for financial decision-making and the deployment of machine learning architectures.
Upon successful completion, students will be able to:
- Demonstrate a working understanding of algorithmic logic, control flow, and core data structures as applied to financial and accounting datasets.
- Write clean, efficient, and scalable code in R and Python to import, clean, transform, and analyze complex financial datasets.
- Implement optimization algorithms to support financial decision-making tasks such as portfolio construction, cost minimization, and resource allocation.
- Build, train, and evaluate machine learning models relevant to accounting and finance applications, including classification, regression, and anomaly detection.
- Critically assess the suitability and limitations of alternative computational approaches for a given financial or auditing problem.
- Design and carry out data computation tasks using Large Language Models (LLMs).
10 thematic units across the semester.
Data Structures and Financial Data Manipulation: Arrays, lists, dictionaries, and dataframes, importing and cleaning financial datasets, merging and reshaping data, handling missing values and outliers, application to case studies with R and Python.
Numerical Methods and Optimization I: Mathematical foundations of optimization, unconstrained optimization, gradient descent and its variants, convergence criteria, implementation in Python.
Numerical Methods and Optimization II: Constrained optimization, linear and quadratic programming, applications in portfolio optimization and resource allocation, interpretation and application to case studies with Python.
Machine Learning I – Supervised Learning: Regression and classification frameworks, train/test split and cross-validation, decision trees and random forests, model evaluation metrics, application to financial datasets with Python.
Machine Learning II – Supervised Learning Advanced: Ensemble methods (bagging, boosting, gradient boosting), regularization (lasso and ridge), hyperparameter tuning, interpretation and application to case studies with Python.
Machine Learning III – Unsupervised Learning: Clustering algorithms (k-means, hierarchical), dimensionality reduction (PCA), anomaly detection for fraud and financial distress. Large Language Models in Computing for Data Science: processing of unstructured data, Retrieval-Augmented Generation (RAG), Model Context Protocol (MCP) servers, agentic AI and validation mechanisms, application to case studies with Python.
Natural Language Processing for Finance: Text preprocessing and feature extraction, bag-of-words and TF-IDF representations, sentiment analysis on financial text, introduction to transformer-based models (FinBERT), application to earnings calls and annual reports.
Scalable Data Pipelines and Workflow Automation: Writing reproducible and modular pipelines, introduction to version control (Git), batch processing of large financial datasets, scheduling and automation, application to case studies with R and Python.
Deployment and Communication of Data Science Solutions: Model serialization and deployment basics, building reproducible research reports (R Markdown, Jupyter), communicating analytical results to non-technical audiences, professional and ethical considerations in financial data science.
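As an illustration of the level of scripting targeted by the unit on gradient descent and convergence criteria, the following is a minimal sketch in plain Python. It is not part of the official course material. The objective function, the learning rate, the tolerance, and the names `gradient_descent` and `example_grad` are all assumptions chosen for the example.

```python
def gradient_descent(grad, x0, learning_rate=0.1, tol=1e-8, max_iter=10_000):
    """Minimize a differentiable function given a function returning its gradient.

    Stops when the Euclidean norm of the gradient falls below `tol`
    (the convergence criterion) or after `max_iter` update steps.
    Returns the final point, the number of update steps taken, and a
    flag indicating whether the convergence criterion was met.
    """
    x = list(x0)
    for iteration in range(max_iter):
        g = grad(x)
        grad_norm = sum(gi * gi for gi in g) ** 0.5
        if grad_norm < tol:
            return x, iteration, True
        # Move against the gradient, scaled by the learning rate.
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x, max_iter, False


# Hypothetical objective for illustration only:
# f(x, y) = (x - 3)^2 + 2 * (y + 1)^2, whose minimum is at (3, -1).
def example_grad(point):
    x, y = point
    return [2.0 * (x - 3.0), 4.0 * (y + 1.0)]


solution, steps, converged = gradient_descent(example_grad, [0.0, 0.0])
print(solution, steps, converged)
```

The stopping rule here is based on the gradient norm. Other convergence criteria, such as a small change in the objective value between iterations, are equally valid choices.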
Description of the assessment process
The language of assessment is English, and students are expected to demonstrate the required level of proficiency.
The assessment of the course consists of:
Midterm exam (40%, problem solving)
Final exam (60%, problem solving)
The evaluation criteria across modes of assessment include the following:
Demonstration of key knowledge related to the content of the course
Demonstration of an ability to apply the knowledge in a given problem or case study
Critical ability, evident in applying appropriate methods and knowledge to a given case and/or in developing theory-based and literature-based arguments
Structure and presentation
Use of English language
More detailed assessment criteria will be provided to you in the module handbook document or posted on the course webpage, if deemed necessary.
- Bird, S., Klein, E. and Loper, E. (2009) Natural Language Processing with Python. O'Reilly Media.
- Gardener, M. (2012) Beginning R: The Statistical Programming Language. John Wiley & Sons.
- Jones, E., Harden, S. and Crawley, M.J. (2022) The R Book. John Wiley & Sons.
- Jurafsky, D. and Martin, J.H. (2009) Speech and Language Processing. 2nd edn. Pearson Prentice Hall. (3rd edition draft available at web.stanford.edu/~jurafsky/slp3/)
- Raschka, S. and Mirjalili, V. (2019) Python Machine Learning. 3rd edn. Packt Publishing.
- Vasiliev, Y. (2020) Natural Language Processing with Python and spaCy: A Practical Introduction. No Starch Press.
- Other library sources, including journal articles accessible through the Library, as assigned by the instructor.