CSC5007Z Databases for Data Scientists
12 NQF credits at HEQSF level 9
Convener: Associate Professor S Berman
Course entry requirements: Acceptance into the Master's degree, specialising in Data Science.
This course will introduce students with little or no prior experience to the three cornerstone database technologies for big data, namely relational, NoSQL and Hadoop ecosystems. The course aims to give students an understanding of how data is organised and manipulated at large scale, and practical experience of the design and development of such databases using open source infrastructure. The relational part will cover conceptual, logical and physical database design, including ER modelling and normalisation theory, as well as SQL coding and best practices for performance enhancement. NoSQL databases were developed for big data and semi-structured data applications where relational systems are too inefficient; all four types of NoSQL architecture will be introduced. Distributed data processing is key in manipulating large data sets effectively. The final section of the course will teach the popular Hadoop technologies for distributed data processing, such as MapReduce programming and the execution model of Apache Spark.
DP requirements: 40% for the assignment component.
Assessment: Students will be assessed by 2 assignments (25% each) and an exam (50%). A sub-minimum of 40% will be required for each of the assignment and exam components of the course.
CSC5009W Data Science Minor Dissertation
90 NQF credits at HEQSF level 9
Convener: Associate Professor M M Kuttel
Course entry requirements: Successful completion of the coursework component of the Master's specialising in Data Science.
The research component of the degree is based on a 90-credit dissertation. The research topic will be based on an analysis of large data sets from Physics, Astronomy, Medicine, Finance or other areas of application, using methodology learnt in the coursework component of the degree. Alternatively, the dissertation may focus on methodological developments in Computer Science required for the analysis of large amounts of data.
School of Information Technology
Skool vir Inligtingstegnologie