Hugging Face 10

Hugging Face, Training a causal language model from scratch

Training a causal language model from scratch - Hugging Face Course: Up until now, we’ve mostly been using pretrained models and fine-tuning them for new use cases by reusing the weights from pretraining. As we saw in Chapter 1, this is commonly referred to as transfer learning, and it’s a very successful strategy for … (huggingface.co) Course material on training a causal language model from scratch. Here, text generation …
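A rough sketch of the chapter’s starting point, assuming the standard 🤗 Transformers API: instead of loading pretrained weights, you build a GPT-2-sized model from a config so the weights are freshly initialized (the checkpoint name here only supplies the tokenizer and architecture sizes):

```python
from transformers import (AutoConfig, AutoTokenizer, GPT2LMHeadModel,
                          DataCollatorForLanguageModeling)

# Reuse the GPT-2 tokenizer, but build the model with fresh (untrained) weights
tokenizer = AutoTokenizer.from_pretrained("gpt2")
config = AutoConfig.from_pretrained("gpt2", vocab_size=len(tokenizer))
model = GPT2LMHeadModel(config)  # from config, not from_pretrained

# For causal LM the collator copies inputs to labels (mlm=False);
# GPT-2 has no pad token, so reuse end-of-sequence
tokenizer.pad_token = tokenizer.eos_token
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
```

From here the course tokenizes a corpus and trains with the Trainer as usual; the difference from earlier chapters is only that nothing is pretrained.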

Data/Information 2022.04.01

Hugging Face, Summarization

Main NLP tasks - Hugging Face Course: In this section we’ll take a look at how Transformer models can be used to condense long documents into summaries, a task known as text summarization. This is one of the most challenging NLP tasks as it requires a range of abilities, such as understanding … (huggingface.co) Let’s look at text summarization, which condenses documents. I loaded the necessary data and printed some random samples. The English and Spanish b…
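A minimal sketch of that workflow; the dataset and checkpoint below are placeholders (the course itself uses a multilingual reviews corpus and an mT5 model):

```python
import random
from datasets import load_dataset
from transformers import pipeline

# Load a summarization dataset and inspect a random sample
ds = load_dataset("cnn_dailymail", "3.0.0", split="validation[:50]")
sample = ds[random.randrange(len(ds))]
print(sample["article"][:300])

# Summarize the sampled document with an off-the-shelf checkpoint
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
print(summarizer(sample["article"][:2000],
                 max_length=80, min_length=20)[0]["summary_text"])
```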

Data/Information 2022.03.25

Hugging Face, Token classification

Main NLP tasks - Hugging Face Course: The first application we’ll explore is token classification. This generic task encompasses any problem that can be formulated as “attributing a label to each token in a sentence,” such as: Of course, there are many other types of token classification … (huggingface.co) We studied tokenizers previously; now we will use them to study token classification, which assigns a label to each token. What is introduced here is …
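For a quick feel for the task, a hedged sketch via the pipeline API; the NER checkpoint is just one example of a token-classification model, not necessarily the post’s:

```python
from transformers import pipeline

# Token classification as NER; aggregation merges subword pieces into words
ner = pipeline("token-classification", model="dslim/bert-base-NER",
               aggregation_strategy="simple")
print(ner("My name is Sylvain and I work at Hugging Face in Brooklyn."))
# -> entities such as PER (Sylvain), ORG (Hugging Face), LOC (Brooklyn)
```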

Data/Information 2022.03.16

Hugging Face, Tokenizers

The 🤗 Tokenizers library - Hugging Face Course: Introduction: In Chapter 3, we looked at how to fine-tune a model on a given task. When we do that, we use the same tokenizer that the model was pretrained with — but what do we do when we want to train a model from scratch? In these cases, using a tokenizer … (huggingface.co) Covers chapter 6 of the course. 1. Training a new tokenizer from an old one: load the dataset and inspect the data to use; use a generator to …
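The core call there is train_new_from_iterator(); a small sketch, with the corpus below as a stand-in (the course trains on Python source code):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def corpus_iterator(batch_size=1000):
    # Yield batches of text so the full corpus never sits in memory
    for i in range(0, len(raw), batch_size):
        yield raw[i : i + batch_size]["text"]

# Train a new tokenizer with the same algorithm and special tokens as GPT-2's
old_tokenizer = AutoTokenizer.from_pretrained("gpt2")
new_tokenizer = old_tokenizer.train_new_from_iterator(corpus_iterator(),
                                                      vocab_size=20000)
new_tokenizer.save_pretrained("my-new-tokenizer")
```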

Data/Information 2022.02.25

Hugging Face, Datasets

The 🤗 Datasets library - Hugging Face Course: Introduction: In Chapter 3 you got your first taste of the 🤗 Datasets library and saw that there were three main steps when it came to fine-tuning a model: Load a dataset from the Hugging Face Hub. Preprocess the data with Dataset.map(). Load and compute … (huggingface.co) Today I review the contents of chapter 5. 1. Loading data: load and inspect the dataset, and use the field argument to specify where to pull the data from (j…
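A self-contained sketch of those two steps, loading a local nested JSON file via the field argument and then transforming it with Dataset.map(); the file name and keys are made up for illustration:

```python
import json
from datasets import load_dataset

# Write a tiny nested JSON file so the example runs standalone
with open("data.json", "w") as f:
    json.dump({"data": [{"text": "hello world"}, {"text": "hi"}]}, f)

# `field` tells load_dataset which key inside the JSON holds the records
ds = load_dataset("json", data_files="data.json", field="data")

# Dataset.map adds or transforms columns example by example
ds = ds.map(lambda ex: {"n_chars": len(ex["text"])})
print(ds["train"][0])
```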

Data/Information 2022.02.24

Hugging Face, Using the Hub and Repositories

Sharing models and tokenizers - Hugging Face Course: In the steps below, we’ll take a look at the easiest ways to share pretrained models to the 🤗 Hub. There are tools and utilities available that make it simple to share and update models directly on the Hub, which we will explore below. We encourage all … (huggingface.co) A summary of “Sharing pretrained models” from Course chapter 4 at the link above. In a notebook environment, notebook_login() …
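A hedged sketch of that flow: authenticate from the notebook, then push a model and tokenizer to the Hub (the checkpoint and repo name below are placeholders):

```python
from huggingface_hub import notebook_login
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Prompts for a Hub access token inside the notebook
notebook_login()

model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# Creates (or updates) a repo under your account and uploads the files
model.push_to_hub("dummy-model")      # repo name is hypothetical
tokenizer.push_to_hub("dummy-model")
```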

Data/Information 2022.02.22

Hugging Face, Loading pretrained models

Sharing models and tokenizers - Hugging Face Course: Using pretrained models: The Model Hub makes selecting the appropriate model simple, so that using it in any downstream library can be done in a few lines of code. Let’s take a look at how to actually use one of these models, and how to contribute back to … (huggingface.co) Contents of the link above. 1. Loading via the pipeline package: the simplest way, but you must load a model that matches the task. 2. The model architecture package…
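Both routes in a few lines, following the course’s CamemBERT example:

```python
from transformers import pipeline, AutoTokenizer, AutoModelForMaskedLM

# 1. Pipeline: simplest, but the checkpoint must fit the task (here fill-mask)
camembert_fill_mask = pipeline("fill-mask", model="camembert-base")
print(camembert_fill_mask("Le camembert est <mask> :)"))

# 2. Architecture classes: load the tokenizer and model explicitly
tokenizer = AutoTokenizer.from_pretrained("camembert-base")
model = AutoModelForMaskedLM.from_pretrained("camembert-base")
```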

Data/Information 2022.02.22

Hugging Face, Up to fine-tuning

Fine-tuning a pretrained model - Hugging Face Course: Now we’ll see how to achieve the same results as we did in the last section without using the Trainer class. Again, we assume you have done the data processing in section 2. Here is a short summary covering everything you will need: Before actually writing … (huggingface.co) I have recently been working through the Hugging Face course. Hugging Face can be thought of as an API and a hub that makes it easy to solve NLP problems. …
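The chapter’s point is the manual training loop that replaces Trainer; a minimal sketch, with a hand-built two-example batch standing in for a real DataLoader:

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased",
                                                           num_labels=2)

# A tiny stand-in batch; in practice this comes from a DataLoader
batch = tokenizer(["I love this!", "Terrible."], padding=True,
                  return_tensors="pt")
batch["labels"] = torch.tensor([1, 0])

optimizer = AdamW(model.parameters(), lr=5e-5)
model.train()
outputs = model(**batch)   # returns the loss when labels are provided
outputs.loss.backward()    # backward pass
optimizer.step()           # update weights
optimizer.zero_grad()
```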

Data/Information 2022.02.22