Programming for Data Scientists

Module Title:
Programming for Data Scientists
Module Code:
DSC6133

Module Content
  • General programming techniques
  • Python environment. Data structures: numbers, strings, lists, tuples, dictionaries. Basic language elements: loops, conditions, functions. Modules. Input and Output. Debugging. Machine learning and data mining in Python.
  • The NumPy package for scientific computing
  • The pandas data analysis library, including reading and writing CSV files
  • The IPython and PyDev development environments
  • The Seaborn and Matplotlib 2D plotting libraries (drawing attractive statistical graphics and visualizations)
  • Language concepts of R: variables, vectors, matrices, data frames
  • R environment
  • Data manipulation in R
  • Importing data from text and spreadsheet files
  • Using external R packages