Generative AI Language Modeling with Transformers

This course is part of multiple programs.

Instructors: Joseph Santarcangelo +2 more

10,597 already enrolled

Included with Coursera Plus

2 modules

Gain insight into a topic and learn the fundamentals.

4.5 stars (82 reviews)

Intermediate level

Recommended experience

8 hours to complete

Flexible schedule

Learn at your own pace

What you'll learn

Explain the role of attention mechanisms in transformer models for capturing contextual relationships in text
Describe the differences in language modeling approaches between decoder-based models like GPT and encoder-based models like BERT
Implement key components of transformer models, including positional encoding, attention mechanisms, and masking, using PyTorch
Apply transformer-based models for real-world NLP tasks, such as text classification and language translation, using PyTorch and Hugging Face tools

Skills you'll gain

Generative AI
Applied Machine Learning
Text Mining
Deep Learning
Natural Language Processing
Large Language Modeling
PyTorch (Machine Learning Library)

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

6 assignments

Taught in English

Build your subject-matter expertise

This course is available as part of

When you enroll in this course, you'll also be asked to select a specific program.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 2 modules in this course

This course provides a practical introduction to using transformer-based models for natural language processing (NLP) applications. You will learn to build and train models for text classification using encoder-based architectures like Bidirectional Encoder Representations from Transformers (BERT), and explore core concepts such as positional encoding, word embeddings, and attention mechanisms.

The course covers multi-head attention, self-attention, and causal language modeling with GPT for tasks like text generation and translation. You will gain hands-on experience implementing transformer models in PyTorch, including pretraining strategies such as masked language modeling (MLM) and next sentence prediction (NSP). Through guided labs, you’ll apply encoder and decoder models to real-world scenarios. This course is designed for learners interested in generative AI engineering and requires prior knowledge of Python, PyTorch, and machine learning. Enroll now to build your skills in NLP with transformers!
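As a rough illustration of the self-attention and causal language modeling ideas mentioned above, the following is a minimal sketch of single-head scaled dot-product attention with an optional causal mask. It is written in plain Python rather than PyTorch so that it runs without dependencies, and it is not taken from the course materials. The function names `softmax` and `scaled_dot_product_attention` and the `causal` flag are hypothetical names chosen for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values, causal=False):
    """Return (outputs, weights) for single-head attention.

    queries and keys are lists of equal-length vectors; values holds one
    vector per key. With causal=True, position i attends only to
    positions j <= i, as in decoder-style (GPT-like) models.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # hide future positions
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[c] for w, v in zip(row, values))
                        for c in range(len(values[0]))])
    return outputs, weights

# Self-attention: the same toy embeddings serve as queries, keys, and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, w = scaled_dot_product_attention(x, x, x, causal=True)
```

With the causal mask on, the first position can attend only to itself, so its weight row places all of its mass on position 0. Multi-head attention runs several such heads on different learned projections of the input and concatenates the results; the projections are omitted here.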

Fundamental Concepts of Transformer Architecture

In this module, you will learn techniques for positional encoding and how to implement them in PyTorch. You will learn how the attention mechanism works and how to apply it to word embeddings and sequences. You will also learn how self-attention supports simple language modeling by predicting the next token. In addition, you will learn about the scaled dot-product attention mechanism with multiple heads and how the transformer architecture enhances the efficiency of attention mechanisms. You will also learn how to implement a series of encoder layer instances in PyTorch. Finally, you will learn how to use transformer-based models for text classification, including creating the text pipeline, building the model, and training it.
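To make the positional encoding topic concrete, here is a minimal sketch of one common technique, the fixed sinusoidal encoding from the original Transformer paper. Whether the course uses this exact variant is an assumption; learnable position embeddings are another option. The sketch uses plain Python rather than PyTorch so it runs without dependencies, and the function name `positional_encoding` is hypothetical.

```python
import math

def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings.

    Even dimensions use sine and odd dimensions use cosine. Each
    (sin, cos) pair shares one frequency, 1 / 10000^(2k / d_model).
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # i - i % 2 equals 2k for both members of pair k.
            angle = pos / (10000 ** ((i - i % 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

# One row per position; in a model, row p is added to the embedding
# of the token at position p.
pe = positional_encoding(seq_len=4, d_model=6)
```

Position 0 always encodes to alternating 0.0 and 1.0 values, because sin(0) = 0 and cos(0) = 1. Later positions differ in every dimension, which is what lets the model tell otherwise identical tokens apart by where they occur.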

What's included

6 videos, 4 readings, 2 assignments, 2 app items, 1 plugin

6 videos (total 39 minutes)

Course Introduction (2 minutes)
Positional Encoding (6 minutes)
Attention Mechanism (7 minutes)
Self-attention Mechanism (7 minutes)
From Attention to Transformers (7 minutes)
Transformers for Classification: Encoder (8 minutes)

4 readings (total 17 minutes)

Course Overview (5 minutes)
Specialization Overview (7 minutes)
Optimization Techniques for Efficient Transformer Training (3 minutes)
Summary and Highlights (2 minutes)

2 assignments (total 30 minutes)

Graded Quiz: Fundamental Concepts of Transformer Architecture (15 minutes)
Practice Quiz: Positional Encoding, Attention, and Application in Classification (15 minutes)

2 app items (total 105 minutes)

Hands-on Lab: Attention Mechanism and Positional Encoding (45 minutes)
Hands-on Lab: Applying Transformers for Classification (60 minutes)

1 plugin (total 2 minutes)

Helpful Tips for Course Completion (2 minutes)

Advanced Concepts of Transformer Architecture

In this module, you will learn about decoders and GPT-like models for language translation, train the models, and implement them using PyTorch. You will also learn about encoder models with Bidirectional Encoder Representations from Transformers (BERT) and pretrain them using masked language modeling (MLM) and next sentence prediction (NSP). You will also perform data preparation for BERT using PyTorch. Finally, you will learn how transformers are applied to translation by studying the transformer architecture and its PyTorch implementation. The hands-on labs in this module give you practice using the decoder model, the encoder model, and transformers in real-world applications.
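The data preparation step for masked language modeling can be sketched as follows. This is a minimal illustration in plain Python, not the course's lab code. It assumes the masking recipe from the original BERT paper: about 15% of tokens are selected for prediction, and of those 80% become the mask token, 10% become a random token, and 10% stay unchanged. The names `mask_tokens`, `MASK`, `mask_prob`, and `seed` are hypothetical, and the handling of special tokens such as [CLS] and [SEP] is left out.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (inputs, labels) for masked language modeling.

    labels[i] holds the original token where the model must make a
    prediction, and None at every other position.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            labels.append(token)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)               # 80%: mask token
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: random token
            else:
                inputs.append(token)              # 10%: unchanged
        else:
            inputs.append(token)
            labels.append(None)
    return inputs, labels

tokens = "the cat sat on the mat".split()
inputs, labels = mask_tokens(tokens, vocab=tokens, seed=0)
```

The training loss is computed only at positions where `labels` is not None. Leaving some selected tokens unchanged or swapping in random ones keeps the model from relying on the mask token, which never appears at fine-tuning time.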

What's included

10 videos, 6 readings, 4 assignments, 4 app items, 2 plugins

10 videos (total 67 minutes)

Language Modeling with the Decoders and GPT-like Models (6 minutes)
Training Decoder Models (7 minutes)
Decoder Models- PyTorch Implementation-Causal LM (5 minutes)
Decoder Models: PyTorch Implementation Using Training and Inference (5 minutes)
Encoder Models with BERT: Pretraining Using MLM (5 minutes)
Encoder Models with BERT: Pretraining Using NSP (6 minutes)
Data Preparation for BERT with PyTorch (8 minutes)
Pretraining BERT Models with PyTorch (8 minutes)
Transformer Architecture for Language Translation (5 minutes)
Transformer Architecture for Translation: PyTorch Implementation (7 minutes)

6 readings (total 9 minutes)

Summary and Highlights (1 minute)
Summary and Highlights (1 minute)
Summary and Highlights (1 minute)
Course Conclusion (2 minutes)
Thanks from the Course team (2 minutes)
Congratulations and Next Steps (2 minutes)

4 assignments (total 63 minutes)

Graded Quiz: Advanced Concepts of Transformer Architecture (30 minutes)
Practice Quiz: Decoder Models (12 minutes)
Practice Quiz: Encoder Models (12 minutes)
Practice Quiz: Application of Transformers for Translation (9 minutes)

4 app items (total 180 minutes)

Hands-on Lab: Decoder GPT-like Models (45 minutes)
Hands-on Lab: Pretraining BERT Models (60 minutes)
Hands-on Lab: Data Preparation for BERT (45 minutes)
Lab: Transformers for Translation (30 minutes)

2 plugins (total 18 minutes)

Cheat Sheet: Language Modeling with Transformers (15 minutes)
Course Glossary: Language Modeling with Transformers (3 minutes)

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Instructor ratings

4.1 (13 ratings)

Joseph Santarcangelo

IBM

35 courses, 1,966,408 learners

Offered by

IBM

Learner reviews

4.5

82 reviews

5 stars: 77.10%
4 stars: 12.04%
3 stars: 3.61%
2 stars: 1.20%
1 star: 6.02%

Reviewed on Dec 29, 2024

This course gives me a wide picture of what transformers can be.

Reviewed on Oct 10, 2024

Once again, great content and not that great documentation (printable cheatsheets, no slides, etc). Documentation is essential to review a course content in the future. Alas!

Reviewed on Nov 16, 2024

need assistance from humans, which seems lacking though a coach can give guidance but not to the extent of human touch.

Frequently asked questions

How long does it take to complete the Specialization?

It will take only two weeks to complete this course if you spend 3–5 hours of study time per week.

Do I need any background knowledge to complete this course successfully?

It helps to have a basic knowledge of Python and familiarity with machine learning and neural network concepts. Familiarity with text preprocessing steps and with N-gram, Word2Vec, and sequence-to-sequence models is beneficial. Knowledge of evaluation metrics such as bilingual evaluation understudy (BLEU) is an advantage.

Which roles can I perform after completing this course?

This course is part of the Generative AI Engineering Essentials with LLMs PC specialization. When you complete the specialization, you will have the skills and confidence to take on jobs such as AI Engineer, NLP Engineer, Machine Learning Engineer, Deep Learning Engineer, and Data Scientist.

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.
