Note: This course has now concluded for AY2025/2026. I will update this website when the next run is confirmed.
Course description
Interested in applying your data science skills for the public good? In this course, we will learn how data science and AI can be used to tackle public sector challenges and improve societal outcomes. We start with an overview of Singapore's public sector and the range of policy issues Singapore grapples with, before exploring examples of how data science is used in the public sector. We then dive into three content areas: geospatial data analysis, natural language processing and LLMs, and responsible AI, before sharpening your skills in data science scoping and technical communication. The course culminates in a group project focused on applying your data science skills and knowledge to real problems. Join us on an exciting journey of learning how to use data science for the public good!
What you will learn from this course:
- How data science is used in the Singapore public sector
- Technical knowledge and practical skills for delivering good data science and AI projects
- How to work in a team to apply data science to a public policy problem
Course outline
In the first half of the course, we cover content and skills that will help you with your group project, including both technical topics, like geospatial and text data analysis, and crucial soft skills, like technical communication and project scoping. In the second half of the course, you will be given time to focus on your group projects, with optional consultations to help you along the way. The group presentations will be held in the last two weeks of the semester.
- Week 1 (13 Jan 2026): Data science in the Singapore public sector
- Week 2 (22 Jan 2026): Analysing geospatial data
- Week 3 (29 Jan 2026): Analysing text data + introduction to LLMs
- Week 4 (5 Feb 2026): Scoping data science projects + group project kickoff
- Week 5 (12 Feb 2026): AI safety and fairness
  - We will take a 2-week break for CNY and reading week. Note that the scoping document is due on 20 Feb 2026.
- Week 7 (5 Mar 2026): Technical communication
- Week 8 (12 Mar 2026): Building LLM applications with OpenAI by Gabriel Chua, Developer Experience Engineer at OpenAI (Note: OpenAI credits will be provided)
- Weeks 9-11 (19 Mar 2026 - 2 Apr 2026): Virtual consultations (no in-person class)
- Week 12 (9 Apr 2026): Presentations (Problem 1)
- Week 13 (16 Apr 2026): Presentations (Problem 2)
Note: This course has now concluded. Future runs may not follow the same course outline.
Course requirements
To do well in this course, students should:
- Have a strong understanding of key machine learning and data science concepts
- Be proficient in programming with R or Python
- Be comfortable with essential development tools (e.g. Git, venv, Docker)
- Have a basic grasp of the Singapore public sector and policy issues
- Be interested in applying data science to public policy problems
Useful readings
To help you prepare for the course, here are some recommended online resources:
Data Science in the Public Sector
- Solving Real World Problems in the Public Service with AI - Chang Sau Sheong
- AI Practice Technical Blog - GovTech Singapore
- AI in Cybersecurity: Fighting scams with AI and overcoming data poisoning - GovTech Singapore
- data.gov.sg - Open Government Products
Geospatial Data Analysis
- Tutorial 1.2 - Spatial analysis with Python - Henrikki Tenkanen
- Mapping Motor Vehicle Collisions in New York City - Todd W. Schneider
- A linguistic streetmap of Singapore - Michelle Fullwood
Natural Language Processing & LLMs
- LLM Course - Hugging Face
- The Illustrated Word2Vec - Jay Alammar
- The Illustrated Transformer - Jay Alammar
- Training language models to follow instructions with human feedback - OpenAI
AI Safety & Fairness
- Responsible AI Playbook - GovTech Singapore
- Machine Bias - ProPublica
- Inside Amsterdam’s high-stakes experiment to create fair welfare AI - MIT Technology Review
About the lecturer
I am a Staff Data Scientist at the AI Practice in GovTech Singapore, where I serve concurrently as the technical lead of the department's Responsible AI team and the Assistant Head of the department. In these roles, I lead applied research and implementation work across AI safety, robustness, fairness, and evaluation, and help shape the department's strategy, partnerships, and engagement efforts. Our team has developed and open-sourced resources including LionGuard, KnowOrNot, the ARC Framework, MinorBench, and the Responsible AI Playbook, while operationalising safety testing and guardrails for the Singapore public sector.
Beyond my work in government, I have been an adjunct lecturer at the National University of Singapore for the past two years, teaching an undergraduate course on data science and public policy. Prior to my current role, I held a dual appointment at MDDI's National AI Group, led the data science team at the Ministry of Manpower's Co-Lab unit, and was previously the Lead Data Scientist at Lovelytics in Washington, D.C. I graduated from Columbia University in 2019 with an MA in Quantitative Methods in the Social Sciences (Data Science Focus), and from the University of Oxford in 2018 with a BA (Hons) in Philosophy, Politics and Economics. See my website for more details.