Large Language Model Embeddings

Large language models (LLMs) like GPT, BERT, and Llama have revolutionized AI and natural language processing (NLP) by enabling sophisticated text understanding and generation. At the core of these models lie embeddings: high-dimensional vector representations of text that preserve semantic relationships. These embeddings serve as the foundation for a wide range of machine learning (ML) and deep learning (DL) applications.

This workshop will offer a comprehensive hands-on introduction to Large Language Model Embeddings, starting with the foundational concepts of LLMs and the transformer architecture that powers them. Participants will learn how to use the popular Hugging Face Transformers Python library to efficiently work with pre-trained models. We will also discuss the importance of GPU acceleration for generating embeddings and fine-tuning models on large datasets.
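
As a preview of the hands-on material, the sketch below shows one common way to generate sentence embeddings with the Hugging Face Transformers library, using GPU acceleration when available. The model name and the mean-pooling step are illustrative assumptions, not necessarily the exact choices made in the workshop:

    import torch
    from transformers import AutoTokenizer, AutoModel

    # Illustrative model choice; the workshop may use a different checkpoint
    model_name = "bert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)

    # Use a GPU if one is available
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()

    sentences = [
        "Embeddings map text to dense vectors.",
        "Vector representations capture semantic meaning.",
        "It rained heavily in Seattle yesterday.",
        "The forecast calls for sunny skies tomorrow.",
    ]
    inputs = tokenizer(sentences, padding=True, truncation=True,
                       return_tensors="pt").to(device)

    with torch.no_grad():
        outputs = model(**inputs)

    # Mean-pool token embeddings, ignoring padding positions
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)
    print(embeddings.shape)  # (4, 768) for bert-base-uncased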

The workshop will cover practical applications such as sentiment analysis, document classification, clustering, and dimensionality reduction. Participants will gain hands-on experience applying these concepts to real-world text data: they will preprocess text, generate embeddings using pre-trained models, and apply these embeddings in downstream ML/DL analyses. The interactive session will include live coding examples, demonstrations of embedding-based tasks, and opportunities for participants to experiment with their own data.
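
To illustrate the downstream side, the fragment below is a minimal sketch, assuming the embeddings tensor from the previous example and scikit-learn installed, that applies dimensionality reduction and clustering to the embedding matrix:

    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    # Convert the PyTorch embeddings from the previous sketch to NumPy
    X = embeddings.cpu().numpy()

    # Project onto 2 components for plotting or inspection
    X_2d = PCA(n_components=2).fit_transform(X)

    # Group texts by semantic similarity
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(labels)  # sentences about embeddings vs. weather fall into separate clusters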

By the end of this workshop, attendees will have:

  • An understanding of the theory behind LLMs and embeddings.

  • Practical skills for using embeddings in ML/DL pipelines.

  • An appreciation of the versatility of embeddings across domains and real-world applications.


Prerequisites: Basic knowledge of ML, DL, and PyTorch.

Length: 2 Hours

Level: Intermediate/Advanced

Register: Thursday, February 13, 2025 - 14:00 to 16:00