Course Description

In this Specialization, learners will develop foundational Data Science skills to prepare them for a career or for further study of more advanced topics in Data Science. The Specialization covers what Data Science is and the various activities that a Data Scientist performs. It familiarizes learners with various tools, such as Jupyter notebooks, MS Excel, and Stata, and teaches the methodology involved in tackling data science problems. Learners will complete hands-on labs and projects to apply their newly acquired skills and knowledge.

Course Outline

Title Details
Course Code & Title FIN 851: Data Sciences for Finance
Program(s) MS (Management) / MBA
Instructor Prof. Attaullah Shah
Website (if any):
Email For assignments:

For Other communications:

Office Location Basement, Academic Block.
Office Contact Hours: From 9 AM to 2 PM
Course Description: The amount of data available to organizations and individuals is unprecedented. Financial services sectors, including securities & investment services and banking, have the most digital data stored per firm on average. Finance companies that want to maximize use of this available data require professionals who have a keen understanding of data science and know how to use it to solve meaningful business challenges.

This course provides a structured teaching environment where students learn classic data science methods, which are used as the basis for many financial technologies. By the end of the course, participants will have applied Python, Stata, and Excel, together with essential data science techniques, to solve complex finance problems.

Course Resources: For learning programming, I recommend the following books by Jake VanderPlas, both of which are freely available:
  1. A Whirlwind Tour of Python (link)
  2. Python Data Science Handbook (link)
Lectures and exercises posted on Economics and Finance with Python
Course Assessment(s) 50% of the marks are based on weekly assignments that involve solving Python / Stata / Excel problems. The rest are based on the mid-term exam (20%) and the comprehensive exam (30%).
Course Methodology The course is conducted in a computer lab. Each lecture starts with a discussion of a model or issue related to financial data. The discussion is followed by a practical demonstration of how to solve the model or issue. Finally, students are given a chance to solve the same or similar problems on their own.
Course Objectives The course has the following objectives:

Enable students :

  1. To understand the basics of Python / Stata / Excel programming
  2. To understand different data structures
  3. To import data from different sources, and export to different formats
  4. To manipulate data, perform calculations on groups such as portfolios, firms, years, industries, etc.
  5. To gain a foundation for performing data analytics in finance-related roles both inside and outside the financial sector.
Learning Outcomes At the end of the course, students will:
1. Be able to use major software for data management and analysis (this might include Python, Stata, or any other software that is popular in the market)
2. Develop relevant programming abilities
3. Demonstrate skill in data management
4. Execute statistical analyses with professional statistical software.
Behavioral Expectations/ Class Policies (if any): 1. Students are expected to reach class within 2 minutes of the start of the class
2. Students are expected to submit their assignments within one week
3. The preferred learning style for the course is participative. Therefore, students are encouraged to raise questions, comment on solutions, suggest alternative solutions to problems, and answer questions raised by the instructor.

Course Schedule

Week No. Description     Resources
Week 1 Overview of the:
Week 1 Data Science for Finance
Week 1 Big Data
Week 1 Python
Week 1 Python Installation with Anaconda
Week 1 Overview of the Jupyter Notebook
Week 2 Python Programming Concepts
Week 2 Data types – Numbers, strings, Boolean, etc.
Week 2 Math Operations
Week 2 Python list
Week 2 Python Dictionary
Week 2 Manipulating lists
Week 3 Python Programming Concepts – 2
Week 3 Loops – FOR, WHILE, etc
Week 3 Writing Functions
Week 3 Built-in Methods / Functions
Week 3 Create a function – number to text
Week 4 Libraries – Pandas
Week 4 Create Pandas Series
Week 4 Compare Series
Week 4 Math operations on Series
Week 4 From Series to Python List or vice versa
Week 4 Filtering or sub-setting Series
Week 4 Descriptive statistics of a Series
Week 4 Common elements of two series
Week 4 Get previous values of Series using
Week 4 MAP function
Week 5 Libraries – Pandas DataFrames
Week 5 Import data as a DataFrame
Week 5 Sub-setting a DataFrame
Week 5 LOC vs ILOC methods
Week 5 Filtering Data using conditions
Week 5 Dropping column from a DataFrame
Week 5 Date and Time in Pandas
Week 6 Libraries – Pandas – Groupby Method
Week 6 Calculate Stock Returns in Panel Data     Quantitative Economics with Python
Week 6 Convert Stock returns to a Monthly frequency
Week 6 Reducing data to a Monthly frequency
Week 6 Dropping duplicates
Week 6 Writing DataFrame to Excel | Stata format
Week 7 Tidy Data
Week 7 Reshape Data from wide to long format
Week 7 Reshape Data from long to wide format
Week 7 Panel data Format
Week 8 Libraries – Pandas – Merging DataFrames
Week 8 Data Merge
Week 8 append
Week 8 Concatenate
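
The Week 6 topics (stock returns in panel data and conversion to a monthly frequency) can be sketched in pandas as follows. This is a minimal illustration, not the course's own solution: the firms, dates, prices, and column names (`firm`, `date`, `price`, `ret`, `monthly_ret`) are made up for the example, and monthly returns are assumed to be compounded from simple daily returns.

```python
import pandas as pd

# Hypothetical daily prices for two firms (illustrative numbers only).
prices = pd.DataFrame({
    "firm": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2023-01-30", "2023-01-31", "2023-02-01"] * 2),
    "price": [100.0, 110.0, 121.0, 50.0, 55.0, 44.0],
})

# Sort so that each firm's observations are in time order.
prices = prices.sort_values(["firm", "date"])

# Simple return computed within each firm, so that the first
# observation of one firm never uses the last price of another.
prices["ret"] = prices.groupby("firm")["price"].pct_change()

# Compound the daily returns within each firm-month.
prices["month"] = prices["date"].dt.to_period("M")
monthly = (
    prices.dropna(subset=["ret"])
    .groupby(["firm", "month"])["ret"]
    .agg(lambda r: (1 + r).prod() - 1)
    .reset_index(name="monthly_ret")
)

print(monthly)
```

Grouping by `firm` before calling `pct_change` is the key design choice here: without it, the return on the first row of firm B would be computed from the last price of firm A.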

Mid-Term Exam

Week 9-10 Fama and French Type Portfolio Creation
Week 9-10 Downloading and importing data into Pandas DataFrames
Week 9-10 Merging different datasets such as share prices, index, and risk-free rates
Week 9-10 Arranging data in panel format
Week 9-10 Classifying firms into portfolios using firm-characteristics such as size, book-to-market, etc.
Week 9-10 Finding Portfolio returns and risk
Week 11 Data Scraping from Websites
Week 11 Installation of the required packages (beautifulsoup, chromedriver)
Week 11 Introduction to HTML tables and tags
Week 11 Identification of parts of the code that can be used in loops to automate data scraping
Week 11 Data scraping by HTML tags or classes
Week 11 Writing the scraped data to Pandas’ DataFrames
Week 12 Getting Started with Stata
Week 12 Introduction to Stata, DO files, and log files
Week 12 Making research reproducible with Stata
Week 12 Importing and exporting data
Week 12 Summary statistics, tabulation, rolling window statistics
Week 12 Creating publication quality tables with asdocx / asdoc
Week 13-14 Panel Data regression and univariate analysis
Week 13-14 Regression diagnostics
Week 13-14 Steps in Panel Data Analysis
Week 13-14 Pooled / OLS regression
Week 13-14 Fixed effects models
Week 13-14 Random effects model
Week 13-14 Hausman test
Week 15 Advanced Topics in Data Management using MS Excel
Week 15 Filtering techniques
Week 15 Conditional formatting
Week 15 Vlookup / index / match functions
Week 15 Pivot Tables
Week 16 Advanced Topics in Data Management using MS Excel – II
Week 16 Data Validation
Week 16 Data Protection
Week 16 Macros / VBA
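
As a rough illustration of the Weeks 9-10 steps of classifying firms into portfolios and finding portfolio returns and risk, the sketch below sorts firms into three size groups and summarizes each group. It is a deliberately simplified single sort on one cross-section with equal weighting, not the full Fama and French procedure; the data, the column names (`size`, `ret`, `size_portfolio`), and the three-group split are assumptions made for the example.

```python
import pandas as pd

# Hypothetical cross-section of six firms for one period (made-up numbers).
firms = pd.DataFrame({
    "firm": list("ABCDEF"),
    "size": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "ret": [0.06, 0.04, 0.03, 0.02, 0.01, 0.00],
})

# Classify firms into three equally populated size portfolios.
firms["size_portfolio"] = pd.qcut(
    firms["size"], q=3, labels=["Small", "Medium", "Big"]
)

# Equal-weighted mean return and standard deviation per portfolio.
portfolio_stats = (
    firms.groupby("size_portfolio", observed=True)["ret"].agg(["mean", "std"])
)

print(portfolio_stats)
```

With panel data, the same classification would typically be repeated period by period, for example by applying `pd.qcut` within a `groupby` on the date column.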

Lecture Notes and Files

Week No. Description
Lecture 1: Getting Started with Python Download
Lecture 1.1: Practice on Python lists Download
Lecture 1.2: Advanced Topic: Python Dictionary Download
Lecture 2: Functions | Notes
Project: Create a Number to Text Function
Access Page
Lecture 3: Introduction to Pandas Series Download
Lecture 4: Pandas DataFrames Download
Lecture 4.1: Date and Time in Pandas Download
Lecture 5.1: Pandas – Groupby Download
Lecture 5.2: Exercise – Groupby Download
Lecture 6: Merging Datasets Download
Lecture 7: Portfolios and Returns Download
Lecture 8: Tidy Data Download
Lecture 9: Data Cleaning Download

Assignments 2023

Assignment 1 & 2

String Assignment

Assignment: String Methods

Submission Link for Assignment 1 – Alice

In this assignment, you will be using string methods to
process a text document and extract information from it. Follow the instructions
below to complete the assignment:

  1. Download the text document “alice.txt” from this
    link. This document contains the text of the book “Alice’s Adventures in
    Wonderland” by Lewis Carroll.
  2. Write a Python script that reads in the contents of the “alice.txt” file and
    stores it as a string variable.
  3. Use string methods to process the text data and answer the following
    questions:
    a. How many times does the word “Alice” appear in the text?
    b. What is the longest word in the text?
    c. How many unique words are in the text?
    d. What is the most common word in the text (excluding common words such as
    “the”, “and”, etc.)?
  4. Print out the answers to the questions in a readable format (e.g. “The word
    ‘Alice’ appears 357 times in the text.”)
  5. Save your script as “” and submit it along with a text file
    of your answers to the questions.
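
One possible way to approach the steps above is sketched below. It is only an outline under stated assumptions: a short made-up sample string stands in for the contents of “alice.txt”, words are compared case-insensitively after stripping punctuation, the stop-word list is a tiny illustrative one, and `collections.Counter` from the standard library is used as a convenience for counting. The variable names are hypothetical.

```python
import string
from collections import Counter

# In the assignment, the text would be read from the file instead, e.g.:
#     with open("alice.txt", encoding="utf-8") as f:
#         text = f.read()
# A made-up sample string is used here so the sketch runs on its own.
text = (
    "Alice was beginning to get very tired. "
    "Alice looked at the rabbit, and the rabbit ran. The rabbit hid."
)

# Split on whitespace, strip surrounding punctuation, and lower-case.
words = [w.strip(string.punctuation).lower() for w in text.split()]
words = [w for w in words if w]  # drop tokens that were only punctuation

alice_count = words.count("alice")   # a. occurrences of "Alice"
longest_word = max(words, key=len)   # b. longest word (first one if tied)
unique_count = len(set(words))       # c. number of unique words

# d. most common word, ignoring an illustrative list of common words
stop_words = {"the", "and", "to", "at", "was"}
counts = Counter(w for w in words if w not in stop_words)
most_common_word, most_common_freq = counts.most_common(1)[0]

print(f"The word 'Alice' appears {alice_count} times in the text.")
print(f"The longest word is '{longest_word}'.")
print(f"There are {unique_count} unique words.")
print(f"The most common word is '{most_common_word}' ({most_common_freq} times).")
```

The answers depend on the cleaning choices: whether matching is case-sensitive, how hyphenated words and apostrophes are handled, and which words count as "common" will all change the counts on the full text.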

Class files
Year: 2023

  1. Feb 1, 2023: Week 1: Python Basics: Class 1 | Class 2
  2. Feb 15, 2023: Week 2: String Methods: Class 1 | Python List | Practices on Python List
  3. Feb 22, 2023: Week 3: Loops
  4. March 1, 2023: Lecture 6 – Functions
  5. March 02, 2023: Lecture 7 – IF ELSE Statements
  6. March 15, 2023: Lecture 8 – Handling Files
  7. March 16, 2023: Lecture 9 – Reading Files – Exercise
  8. March 22, 2023: Lecture 10 – Pandas
  9. March 29, 2023: Lecture 11 – Reading files Revision
  10. April 13, 2023: Lecture 12 – Pandas DataFrame
  11. April 14, 2023: Lecture 13 – Dates and Groupby – Pandas
  12. May 24, 2023: Lecture 1 DO file: Stata Basics
  13. May 24, 2023: Lecture 2: DO file: Stata – Use, Save, asdoc
  14. June 7, 2023: Lecture 3: DO file: Stata – Panel
  15. June 7, 2023: Lecture 3: Data file: Stata – PanelData.dta
  16. June 9, 2023: PDF file:  Stata Notes


Results Summary

View Results Summary

Marks and Comments of Week 1 Assignments:

View here

Marks and Comments Assignments 2

View here

Marks and Comments Assignments 3

View here

Assignment 4 – Comments and marks [ View ]

Assignment 5 – Comments and marks [ view ]

Assignment 6 – Comments and marks [ view ]

Assignment 7 – Comments and marks [ view ]