Course Description


In this Specialization learners will develop foundational Data Science skills to prepare them for a career or further learning that involves more advanced topics in Data Science. The specialization entails understanding what is Data Science and the various kinds of activities that a Data Scientist performs. It will familiarize learners with various open source tools, like Jupyter notebooks, MS Excel and Stata. It will teach you about methodology involved in tackling data science problems. Learners will complete hands-on labs and projects to apply their newly acquired skills and knowledge.

Course Outline


Too often, finance courses stop short of making a connection between textbook finance and the problems of real-world business. This course bridges this gap between theory and practice by providing a nuts-and-bolts guide to solving common financial models with spreadsheets. This course takes the students step by step through financial models and various data management issues, showing how they can be solved.

Week No. Description    
Week 1 Lecture 1: Introduction to Python
Week 1 Lecture 1.1: Getting Started with Pandas
Week 1 Lecture 1.2  Tidy Data

Lecture Notes and Files

Week No. Description
LECTURE 1: Getting Started with Python  Download
Lecture 1.1: Practice on Python lists Download
Lecture 1.2: Advanced Topic: Python Dictionary Download
Lecture 2: Project: Create a Number to Text Function Access Page
Lecture 3: Introduction to Pandas Series Download
Lecture 4: Pandas DataFrames Download
Lecture 4.1: Date and Time in Pandas Download
Lecture 5.1 : Pands – Groupby Download
Lecture 5.2 : Excercise – Groupby Download
Lecture 6: Merging Datasets Download
Lecture 7: Portfolios and Returns Download
Lecture 8: Tidy Data Download
Lecture 9: Data Cleaning Download

Assignment 9 : Advanced Excel

Marks = 20
Deadline = August 27, 2020; till 11:59:59 PM
Submission Link = Submit on google classroom

Tasks to be completed in the assignment

Important Note: Do not submit files that were created during a lecture in the lab. You are supposed to submit files that you have created at home. The purpose is to have as much practice as possible. Use as many comments and examples as possible. Marks are based on the size, examples, and comments inside the submitted assignments.


1. Show the use of VLOOKUP for the following three uses: (4.5 marks)

1.1 Merge two dataset of your own choice (Examples can be merging share prices data with index on the basis of dates)

1.2 Make a data retrieval form (Examples can be Students data, where we retrieve students information based on students registration number)

1.3 Assign quota marks if students belong from different districts (Example: Make a list of students and in second column write their domiciles. Make another list where domiciles have different points, such as Peshawar has 10, Dir has 20, Bannu has 30, etc. Then use vlookup function to assign these scores to students in the first list)

2. Macro Exercise (5.5 Marks)

2.1 Make a macro that copies data from one sheet to another and then deletes the data on the first sheet. Give this macro a short-cut of Ctr + q

2.2 Make a macro that formats a data table in the following way: a. Header row has light grey color b. Header row has bold text c. Header row has upper and lower cell borders d. The last row on the table has bottom cell border

2.3 Search internet to find and use at least 10 useful macros, assign these macros to clickable buttons

3.  Protection : (3 marks)

3.1 Use a separate Excel file, name it “Protected” and make this file password protection so that no one can open it without a password of 123

3.2 Show practice on data entry in one sheet and some formulas on another sheet, using the data from the first sheet. (Example: Account data on one sheet, and Ratios on another sheet)

3.3 Protect the formulas on the second sheet with a password 123, the formula cells should not be clickable

3.4 Make a sheet, name it secret. Hide that sheet and password protect it from unhiding.

4. IF Statements: (4 marks)

4.1 Use at least 5 different examples of IF and nested IF statements.

4.2 Show the use of OR functions with IF

4.3 Show the use of AND function with IF

5. Data Validation (3 Marks)

5.1 Show example of data validation using lists

5.2 Show examples of data validation using whole numbers

5.3 Show examples of data validation using text length

6. Import and Filtering (3 Marks)

6.1 Show practice on importing delimited text

6.2 show practices on filtering records (You may use merit list file from IMSciences website

6.3 Show practices on Pivot Tables

Please note: You can do all these in one excel file or in different files. If one excel file, then given proper names to excel sheets. If different excel files, then zip all files in one folder and upload.

Assignment 8 : Data Scrapping with Python

Marks = 20
Deadline = August 16, 2020; till 11:59:59 PM
Submission Link = Submit on google classroom

Tasks to be completed in the
assignment

Important Note: Do not submit files that were created during a lecture in the lab. You are supposed to submit files that you have created at home. The purpose is to have as much practice as possible. Use as many comments and examples as possible. Marks are based on the size, examples, and comments inside the submitted assignments.


  1. Find a website of your own choice, it should not be opendoors.pk
  2. Scrap Some Tables data
  3. Find some repetitive data, that can be easily identified by symbol, or date, or some other pattern
  4. Also try to scrap some data from the same website or from another website using CLASS or ID identifiers if HTML
  5. The assignment should demonstrate the understanding of loops, Pandas’ DataFrame, data conversion to Excel, etc.
  6. The assignment should also use the method of transposing the data
  7. Submit the Jupiter notebook and scrapped data in a zipped folder
  8. Do it from the point of view that it carries 20 marks, you should demonstrate that a significant amount of work has gone into doing this assignment. So, make sure that you apply almost all techniques that we learnt in the data scrapping or even better, if you can improve upon those. You are welcome to use code from the internet, not from your friends.

Assignment 7 : Research with Stata

Marks = 20
Deadline = July 25, 2020; till 11:59:59 PM
Submission Link = Submit on google classroom

Tasks to be completed in the assignment>

Important Note: Do not submit files that were created during a lecture in the lab. You are supposed to submit files that you have created at home. The purpose is to have as much practice as possible. Use as many comments and examples as possible. Marks are based on the size, examples, and comments inside the submitted assignments.


1. Use data of at least 100 companies (2010-2018). You can download the data
manually or using Python from this page https://opendoors.pk/financials/ or any other site for the following variables.
a. Total assets
b. Net income
c. Dividend Percentage
d. Face value per share
e. Equity
f. sales
g. Financial expenses
h. Total Shares

2. Import the data from Excel files

3. Create the following variables:
a. Firm size = log of total assets
b. Profitability = net income / total assets
c. Dividend payout = dps / eps
d. Leverage = total liabilities / total assets
e. The standard deviation of net income for each firm’s
f. growth = geometric mean of the annual increase in total sales
h. Year dummies
i. Create a dummy if a firm pays a dividend, else zero. call it divdum

4. From the above variable, develop some interesting hypotheses and test them using ttest and regressions.
(Just one example: Using transaction cost hypothesis: Large firms have better access to external finance, hence they pay more dividends as they do not rely desperately on internally generated funds)
More interesting topics can be found here https://opendoors.pk/home/research-topics/research-topics-in-finance/.
I do not believe in change if two students have chosen the same hypothesis :D
5. Arrange data in panel data format and assign numeric identifiers to each firm

6. Declare the data panel data

7. Winsorize the variables at 1st and 99th percentiles

8. Create residual vs fitted values plots and visually confirm influential and outlier observations

9. Remove influential and outlier observations using studentized residuals and Cook’s D

10.Test for panel-level heteroscedasticity

11. Test for panel-level serial correlation

12. Use all appropriate steps in the panel data model selection and use robust standard errors according to steps 10 and 11.

13. Report your descriptive statistics, correlations, VIF, Hausman test, and regression results (use nested models) using asdoc.

14. Make a zip folder and add the following to it:
a. Stata log file
b. Stata Do File
C. Stata Data file
D. Excel file containing the data
E. Word Files containing the results

Assignment 6 : Merging and Portfolios

Marks = 10
Deadline = June 22, 2020; till 11:59:59 PM
Submission Link = Submit on google classroom

Tasks to be completed in the assignment

Important Note: Do not submit files that were created during a lecture in the lab. You are supposed to submit files that you have created at home. The purpose is to have as much practice as possible. Use as many comments and examples as possible. Marks are based on the size, examples, and comments inside the submitted assignments.


1. Download share prices data of 50 companies either from this site https://opendoors.pk/share-prices/ or any other site for the period 1/1/2010 to 12/30/2018

2. Create a year variable in this dataset

3. Download index points for the same period from this page https://opendoors.pk/od/kse-100-index-points-historical-data-in-excel/ or any other site

4. Download company symbols, year, companies book value per share and shares outstanding from this page https://opendoors.pk/financials/ or any other site for the same 50 companies

5. Arrange data in panel data format.

6. Merge share prices data with the financial data (point 4 above) using symbol and year as merge criteria

7. Merge index points into this new dataset (obtained in point 6) using the date as merge criterion

8. In each year, classify firms into small or big based on median market capitalization (me = shares * price). Small firms should be tagged as 1 while big firms should be tagged as 2.

9. In each year, classify firms into three groups based on book-to-market (beme) ratio: firms with low beme ratio should be tagged as 1; medium as 2; and high as 3

10. Make 6 portfolios and find their monthly returns as per the following definitions:

SL = Monthly average returns of firms that have size =1 and beme = 1
SM = Monthly average returns of firms that have size =1 and beme = 2
SH = Monthly average returns of firms that have size =1 and beme = 3
BL = Monthly average returns of firms that have size =2 and beme = 1
BM = Monthly average returns of firms that have size =2 and beme = 2
BH = Monthly average returns of firms that have size =2 and beme = 3

11. Reduce the dataset to portfolio level and remove duplicates.

Assignment 5 : Groupby and Filtering

Marks = 5
Deadline = March 27, 2020; till 11:59:59 PM
Submission Link = MS / MBA Evening | MBA Morning

Tasks to be completed in the assignment

Important Note: Do not submit files that were created during a lecture in the lab. You are supposed to submit files that you have created at home. The purpose is to have as much practice as possible. Use as many comments and examples as possible. Marks are based on the size, examples, and comments inside the submitted assignments.


1. Import the following industries.xls file from our website and name it ind

industries.xls

2. Create stock returns for each firm

3. Convert the firms returns to monthly returns

4. Create monthly returns for each industry

5. Remove duplicates and keep only one observation for each industry in each month

6. Report descriptive statistics for each industry

7. Which industry has the highest risk returns ratio (Sharp ratio)?
(hint find the mean and standard deviation for each industry in the whole period)

8. Repeat all of the above steps using weekly returns

Assignment 4 : Pandas Filtering, Slicing

Marks = 5
Deadline = March 12, 2020
Submission Link = Deadline passed

Tasks to be completed in the assignment

Important Note: Do not submit files that were created during a lecture in the lab. You are supposed to submit files that you have created at home. The purpose is to have as much practice as possible. Use as many comments and examples as possible. Marks are based on the size, examples, and comments inside the submitted assignments.


1. Import a TSV file from a Computer folder and create a DataFrame, name it df2

2. Import Pipe (|) delimited file from this link into DataFrame and name the DataFrame as df3

3. Import data from this csv file into Panda’s DataFrame, name the dataframe as prices

4. Create stock returns for each firm

5. Find mean returns for all firms

6. Create a DataFrame with the name updays from the prices DataFrame by including only those rows where returns are higher than the mean returns

7. Create a DataFrame with the name downdays from the prices DataFrame by including only those rows where returns are lower than the mean returns

8. Find the standard deviation of returns of the updays DataFrame, call this variable as sdup

9. Find the standard deviation of the returns of the downdays DataFrame, call this variable as sddown

10. Find the log(sddown / sdup)

11. Show practice on the following functions using the symbol column from the prices dataframe:

11.1 count()
11.2 value_counts()
11.3 Find min(), max() of the returns column

12. Show practice on isin() method by creating two Pandas Series

14. Show practice on adding columns to a DataFrame

15. Show practice on dropping columns

16. Impress me with something that we have not studied in the class

Assignment 3 : Pandas Series

Marks = 5
Deadline = March 5, 2020
Submission Link =

Tasks to be completed in the assignment

Important Note: Do not submit files that were created during a lecture in the lab. You are supposed to submit files that you have created at home. The purpose is to have as much practice as possible. Use as many comments and examples as possible. Marks are based on the size, examples, and comments inside the submitted assignments.


1. Make a pandas series manually

2. Make a series from a python list

3. Make a series from another series

4. Practice on filtering data from a series

5. Compare two series

6. Perform mathematical operations on a series

7. Convert a pandas Series to python list

8. Use pd.read_csv() method to import a file into pandas

9. Filter that data, create a subset, and find descriptive statistics

10. Convert an object series to numeric series

11. Append one series to another

12. USe shift(), shift(-1) functions

13. Find percentage changes in a series

14. Practice on changing from on directory to another

15. Write DataFrame to Excel File

16. Import TSV file into DataFrame

17. Slice columns from DataFrame

18. Use two conditions to create a boolean series, and then slice data from dataframe using that boolean series

Assignment 2: Python Basics II

Marks = 5
Deadline = Feb 27, 2020
Submission Link =

Tasks to be completed in the assignment

Use Jupyter Notebook for completing the following tasks

1. Make the NumberToText function : minimum values acceptance is 999, though making it to accept any number would be great.

2. String to integers and integers to string practice

3. FOR loop

4. Use DIR command to show available attributes with Python objects

5. Practicing on as many string methods as possible from the dir(str) command

6. Combine two string objects with + symbol

7. Practice on RANGE function

8. Practice on FOR loop using RANGE function

9. USe FOR loop to show odd and even number when looping through RANGE items

10. Show the use of WHILE loop

Assignment 1: Python Basics

Marks = 5
Deadline = Feb 20, 2020
Submission Link =

Tasks to be completed in the assignment

Use Jupyter Notebook for completing the following tasks

1. Show the use of all mathametical operations including the PEMDAS rule

2. Show output of integers and floating numbers

3. Use comments

4. Write and print text

5. Show variable assignments

6. Assign some text to variables and then add more text to the same variable

7. Show practice on SLICING of string

8. SLICE from a given index upto a given index

9. Create lists and slice elements from list

10. Show the use of FOR loop

11. Show the use of WHILE loop

12. SHOW IF, ELIF, ELSE statements

Results


Results Summary

MS Old Batch

MS 2020

MBA Current


 

Marks and Comments of Week 1 Assignments:

MS / MBA Evening

MBA Moring


Marks and Comments Assignments 2

View Marks and Comments Here


Marks and Comments Assignments 3

View Marks and Comments Here


Marks and Comments Assignments 4

View Marks and Comments Here


Marks and Comments Assignments 5

View Marks and Comments Here


Marks and Comments Assignments 6

View Marks and Comments Here


Marks and Comments Assignments 7

View Marks and Comments Here


Marks and Comments Assignments 8

View Marks and Comments Here


Marks and Comments Assignments 9

View Marks and Comments Here