Data/Paper Reading 10

[Paper Translation / Notes] Language Models are Few-Shot Learners

Language Models are Few-Shot Learners Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fi arxiv.org Abstract: Scaling up the language model greatly improves task-agnostic few-shot performance; gradient updates or fin..
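The point of the post is few-shot (in-context) learning: the model conditions on K demonstrations placed in its prompt and answers a new query with no gradient updates. A rough sketch of assembling such a prompt; the sentiment task, examples, and formatting below are hypothetical, not taken from the paper.

```python
# Minimal sketch of few-shot prompting: K labeled demonstrations are placed in
# the context, followed by the query; the model only continues the text and no
# parameters are updated. Task and examples are illustrative, not from the paper.
def build_few_shot_prompt(demonstrations, query, instruction="Classify the sentiment."):
    lines = [instruction, ""]
    for text, label in demonstrations:          # the K in-context examples
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")            # the new input to be answered
    lines.append("Sentiment:")                  # the model completes from here
    return "\n".join(lines)

demos = [("Great acting and a moving story.", "positive"),
         ("Dull plot, I left halfway through.", "negative")]
print(build_few_shot_prompt(demos, "Surprisingly funny and well paced."))
```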

Data/Paper Reading 2024.06.19

[Reading a Paper Together] How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings

Paper link: How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings Replacing static word embeddings with contextualized word representations has yielded significant improvements on many NLP tasks. However, just how contextual are the contextualized representations produced by models such as ELMo and BERT? Are there infini arxiv.org However, ELM..
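One of the paper's measures of how contextual a representation is: self-similarity, the average cosine similarity between a word's contextualized vectors across different sentences. A small NumPy sketch of that measure, assuming the per-occurrence vectors have already been extracted from BERT, ELMo, or GPT-2.

```python
import numpy as np

def self_similarity(vectors):
    """Average pairwise cosine similarity between contextualized vectors of the
    same word type taken from different sentences. Values near 1 mean the
    representations barely change with context (i.e., they are less contextual)."""
    V = np.asarray(vectors, dtype=float)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)   # unit-normalize each vector
    sims = V @ V.T                                     # all pairwise cosines
    off_diag = sims[~np.eye(len(V), dtype=bool)]       # drop self-comparisons
    return off_diag.mean()

# Toy vectors standing in for embeddings of the word "bank" in three sentences.
print(self_similarity([[0.9, 0.1], [0.8, 0.3], [0.1, 0.95]]))
```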

Data/Paper Reading 2023.01.01

[Paper Summary] Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis

Paper link: Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis Recent literature focuses on utilizing the entity information in the sentence-level relation extraction (RE), but this risks leaking superficial and spurious clues of relations. As a result, RE still suffers from unintended entity bias, i.e., the spurious arxiv.org Github..

Data/Paper Reading 2022.12.25

[Reading a Paper Together] Denoising Diffusion Probabilistic Models

Paper link: Denoising Diffusion Probabilistic Models We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound arxiv.org YouTube explanation link: Equation walkthrough link: [Paper Study] Denoising Diffusion Probabilistic Models (DDPM) Explained ─..
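The model in the paper is trained (via the weighted variational bound mentioned in the abstract) to reverse a fixed forward diffusion that gradually adds Gaussian noise. A NumPy sketch of the closed-form forward step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with the paper's linear beta schedule; the toy image and timestep are illustrative.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)          # linear noise schedule used in the paper
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)              # alpha_bar_t = prod_{s<=t} alpha_s

def q_sample(x0, t, seed=0):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps, eps

x0 = np.zeros((8, 8))                       # toy "image"
x_noisy, eps = q_sample(x0, t=500)          # the denoising network learns to predict eps
```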

Data/Paper Reading 2022.12.06

[Paper Summary] EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

Paper link: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion. On five text classificati arxiv.org Code link: GitHub - jasonwei20/ed..
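Since the abstract already names EDA's four operations, here is a minimal self-contained sketch of them. The official jasonwei20 implementation draws synonyms from WordNet; a tiny hard-coded dictionary is used below to keep the example dependency-free.

```python
import random

SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "joyful"]}  # toy stand-in for WordNet

def synonym_replacement(words, n=1):
    out = words[:]
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    for i in random.sample(candidates, min(n, len(candidates))):
        out[i] = random.choice(SYNONYMS[out[i]])      # swap word for a synonym
    return out

def random_insertion(words, n=1):
    out = words[:]
    for _ in range(n):
        w = random.choice([w for w in out if w in SYNONYMS] or out)
        # insert a synonym of a random word (fall back to the word itself if none known)
        out.insert(random.randrange(len(out) + 1), random.choice(SYNONYMS.get(w, [w])))
    return out

def random_swap(words, n=1):
    out = words[:]
    for _ in range(n):
        i, j = random.randrange(len(out)), random.randrange(len(out))
        out[i], out[j] = out[j], out[i]               # swap two random positions
    return out

def random_deletion(words, p=0.1):
    kept = [w for w in words if random.random() > p]  # drop each word with probability p
    return kept or [random.choice(words)]             # never return an empty sentence

sent = "the quick brown fox is happy".split()
print(synonym_replacement(sent), random_insertion(sent), random_swap(sent), random_deletion(sent))
```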

Data/Paper Reading 2022.11.17

[Reading a Paper Together] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper link: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unla arxiv.org Wikidocs link: 02) BERT (Bidirectional Encoder..
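BERT learns from unlabeled text chiefly through masked language modeling: about 15% of tokens are selected, of which 80% become [MASK], 10% a random token, and 10% stay unchanged, and the model must recover the originals. A toy sketch of that corruption step over word strings (real BERT operates on WordPiece token ids).

```python
import random

def mask_tokens(tokens, vocab, mask_token="[MASK]", select_prob=0.15):
    """Return (corrupted tokens, labels). Labels hold the original token at the
    selected positions and None elsewhere; only selected positions are scored."""
    corrupted, labels = [], []
    for tok in tokens:
        if random.random() < select_prob:
            labels.append(tok)
            r = random.random()
            if r < 0.8:                         # 80%: replace with [MASK]
                corrupted.append(mask_token)
            elif r < 0.9:                       # 10%: replace with a random token
                corrupted.append(random.choice(vocab))
            else:                               # 10%: keep the original token
                corrupted.append(tok)
        else:
            corrupted.append(tok)
            labels.append(None)
    return corrupted, labels

vocab = ["the", "cat", "sat", "on", "mat", "dog"]
print(mask_tokens("the cat sat on the mat".split(), vocab))
```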

Data/Paper Reading 2022.10.22

[Reading a Paper Together] Attention Is All You Need

Paper link: Attention Is All You Need The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new arxiv.org Jay Alammar's paper walkthrough (model structure and how it works, with animations): The Illustrated Transformer Discussions: Hacker ..
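The attention mechanism the abstract refers to is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, which the Illustrated Transformer post walks through visually. A minimal NumPy sketch; the shapes are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of the values

rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, 4)), rng.standard_normal((5, 4)), rng.standard_normal((5, 4))
print(scaled_dot_product_attention(Q, K, V).shape)    # (3, 4): one output per query
```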

Data/Paper Reading 2022.10.12

[Reading a Paper Together] Distributed Representations of Words and Phrases and their Compositionality

Paper link: Distributed Representations of Words and Phrases and their Compositionality The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extens arxiv.org Resource links: - Explains the model's structure and behavior in good detail - w..
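The paper's main extensions to skip-gram are negative sampling and subsampling of frequent words. A NumPy sketch of the negative-sampling objective for a single (center, context) pair, -log sigma(u_o . v_c) - sum_k log sigma(-u_k . v_c); the vector dimension and number of negatives below are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(center_vec, context_vec, negative_vecs):
    """-log sigma(u_o . v_c) - sum_k log sigma(-u_k . v_c)
    for one (center, context) pair and k sampled negative words."""
    pos = -np.log(sigmoid(context_vec @ center_vec))
    neg = -np.log(sigmoid(-(np.asarray(negative_vecs) @ center_vec))).sum()
    return pos + neg

rng = np.random.default_rng(0)
v_c = rng.standard_normal(50)                 # "input" vector of the center word
u_o = rng.standard_normal(50)                 # "output" vector of the true context word
negatives = rng.standard_normal((5, 50))      # k = 5 words drawn from the noise distribution
print(negative_sampling_loss(v_c, u_o, negatives))
```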

Data/Paper Reading 2022.10.05

[Reading a Paper Together] Sequence to Sequence Learning with Neural Networks

Paper link: Sequence to Sequence Learning with Neural Networks Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this pap arxiv.org Reference links: 1) Sequence-to-Sequence (seq2seq) For this hands-on exercise, the Keras functional AP..
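The paper maps a variable-length source sequence to a target sequence with an LSTM encoder whose final state initializes an LSTM decoder, feeding the source in reversed order as a training trick. A compact PyTorch sketch under those assumptions; vocabulary sizes and dimensions are placeholders.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal LSTM encoder-decoder: the encoder's final (h, c) state
    initializes the decoder, which reads the target tokens (teacher forcing)."""
    def __init__(self, src_vocab, tgt_vocab, emb=64, hidden=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # The paper feeds the source sentence reversed; emulate that with flip().
        _, state = self.encoder(self.src_emb(src_ids.flip(dims=[1])))
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)              # logits over the target vocabulary

model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))          # batch of 2 source sentences, length 7
tgt = torch.randint(0, 1000, (2, 5))
print(model(src, tgt).shape)                  # torch.Size([2, 5, 1000])
```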

Data/Paper Reading 2022.09.29

[Reading a Paper Together] Efficient Estimation of Word Representations in Vector Space

Paper link: Efficient Estimation of Word Representations in Vector Space We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best per arxiv.org Reference link: 02) Word2Vec As discussed earlier, with one-hot vectors, computing a meaningful similarity between word vect..
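The two architectures the paper proposes are CBOW and skip-gram. As an illustrative usage sketch (not taken from the post), gensim's Word2Vec implementation exposes both through its sg flag.

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens.
sentences = [["king", "rules", "the", "kingdom"],
             ["queen", "rules", "the", "kingdom"],
             ["man", "walks", "the", "dog"],
             ["woman", "walks", "the", "dog"]]

# sg=1 selects skip-gram (predict context words from the center word);
# sg=0 would select CBOW (predict the center word from its context).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

print(model.wv.most_similar("king", topn=3))  # nearest words in the learned vector space
```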

Data/Paper Reading 2022.09.29