MGT-502 / 5 credits
Lecturer: Vlachos Michail
Withdrawal: Students may not withdraw from this course after the registration deadline.
Hands-on introduction to data science and machine learning. We explore recommender systems, generative AI, chatbots, and graphs, as well as regression, classification, clustering, dimensionality reduction, text analytics, and neural networks. The course consists of lectures and coding sessions using Python.
- Basic regression methods. Predicting numeric values.
- Basic classification methods. Predicting categorical values: logistic regression, k-NN classification, decision trees, random forests
- Fundamental concepts: cost function and optimization, gradient descent, K-fold cross-validation, overfitting, model calibration, confusion matrix
- Dimensionality reduction: Principal Component Analysis, Multidimensional Scaling, non-linear dimensionality reduction (ISOMAP, t-SNE), curse of dimensionality
- Neural networks and Deep Learning
- Text analytics: text representation, sentiment classification, similarity search
- Recommender Systems
- Graph Analytics: PageRank, centrality, six degrees of separation
- Generative AI and chatbots: ChatGPT
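To make the "cost function and optimization, gradient descent" item above concrete, here is a minimal sketch of batch gradient descent fitting a straight line by minimizing the mean squared error. It is an illustration only, not course material: the function names, the toy data, the learning rate, and the step count are all assumptions chosen for the example.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by batch gradient descent on the MSE cost.

    lr and steps are illustrative values that work for the toy data below.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = 2 * sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = 2 * sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the cost.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(f"w={w:.4f}, b={b:.4f}, cost={mse(w, b, xs, ys):.2e}")
```

On this noise-free data the fitted slope and intercept approach the values used to generate it. A learning rate that is too large would make the updates diverge instead of converging.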
AI, Machine learning, Data Science Algorithms, Recommender Systems, Graphs, Generative AI, Regression, Classification, Dimensionality reduction, Clustering, Neural networks, Text analytics, Python
Statistics and data science (MGT-499)
Important concepts to start the course
- Basic probability and statistics (random variables, expectation, mean, conditional and joint distributions, independence, Bayes' rule, central limit theorem)
- Basic linear algebra (matrix/vector multiplication, systems of linear equations)
- Multivariate calculus (derivatives with respect to vector and matrix variables)
- Basic programming skills (Python)
By the end of the course, the student must be able to:
- Describe the principal types of machine learning algorithms
- Investigate data, data types, and problems with the data
- Choose an appropriate Machine Learning method for a given task
- Implement Machine Learning algorithms in Python
- Optimize the main tradeoffs such as overfitting and computational cost vs accuracy
- Conduct a Data Science project
- Plan and carry out activities in a way which makes optimal use of available time and other resources.
- Demonstrate the capacity for critical thinking
- Access and evaluate appropriate sources of information.
- Use a work methodology appropriate to the task.
- Lab sessions: coding exercises
- Data Science projects
Expected student activities
The students are expected to:
- attend lectures and lab sessions;
- work on the weekly theory and coding exercises;
- complete assignments (graded);
- conduct data science projects that apply the theory learned during lectures and the code developed during lab sessions (graded).
- Quiz: 40%
- Coding assignments: 30%
- Group Project: 30%
Virtual desktop infrastructure (VDI)
- [not mandatory] Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, by Foster Provost and Tom Fawcett
Library resources
Slides will be made available on the course Moodle page. Notebooks will be made available in a GitHub repository.
In the study plans
- Semester: Spring
- Number of places: 40
- Exam form: During the semester (summer session)
- Subject examined: Data science and machine learning
- Lectures: 3 hour(s) per week x 14 weeks
- Exercises: 2 hour(s) per week x 14 weeks
Reference week