
Programming for Data Scientists

  1. General programming techniques 
  2. Python environment. Data structures: numbers, strings, lists, tuples, dictionaries. Basic language elements: loops, conditions, functions. Modules. Input and Output. Debugging. Machine learning and data mining in Python. 
  3. The NumPy package for scientific computing 
  4. The pandas data analysis library, including reading and writing of CSV files 
  5. The IPython and PyDev development environments 
  6. The Seaborn and Matplotlib 2D plotting libraries (drawing attractive statistical graphics and visualizations)
  7. Language concepts of R: variables, vectors, matrices, data frames 
  8. R environment 
  9. Data manipulation
  10. Importing data from text and spreadsheet files
  11. Using external R packages
  12. Graphics