Tasks:
Input text, syllable-count conditioned lyric generation
Lyric translation: Singable lyric translation
e.g., Korean lyrics → intermediate (T5) → English lyrics
Method
Existing token-based approach
Input:
T5 embedding of lyrics
Stable-diffusion image
or both
Syllable-count prediction model?
Classification over 0-100 syllables?
MetricGAN-like training?
NEW TRAINING FRAMEWORK FOR SPEECH ENHANCEMENT USING REAL NOISY SPEECH
(DeepDream?)
Inference:
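Regenerate-until-success method
Average number of regenerations (a proxy for the failure probability without regeneration)
Failure probability even with regeneration
Measure and compare the metrics without regeneration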
keyword forcing?
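The regenerate-until-success idea and the statistics listed above could be sketched as follows. This is only a minimal sketch under assumptions: `generate` is a hypothetical callable standing in for the lyric model, `count_syllables` is a rough English vowel-group heuristic (not a real syllabifier), and `max_tries` is an illustrative cap that the notes do not specify.

```python
from typing import Callable, Dict, List, Tuple


def count_syllables(line: str) -> int:
    """Rough English heuristic (assumption): count vowel groups per word."""
    vowels = "aeiouy"
    total = 0
    for raw in line.lower().split():
        word = "".join(ch for ch in raw if ch.isalpha())
        if not word:
            continue
        groups = 0
        prev_vowel = False
        for ch in word:
            is_vowel = ch in vowels
            if is_vowel and not prev_vowel:
                groups += 1
            prev_vowel = is_vowel
        # Very rough silent-"e" rule: "love" -> 1, but keep "-le" endings.
        if word.endswith("e") and groups > 1 and not word.endswith("le"):
            groups -= 1
        total += max(groups, 1)
    return total


def generate_until_match(
    generate: Callable[[int], str],  # hypothetical model wrapper
    target: int,
    max_tries: int = 10,  # illustrative cap, not from the notes
) -> Tuple[str, int, bool]:
    """Regenerate until the syllable count matches; return (lyric, tries, ok)."""
    lyric = ""
    for tries in range(1, max_tries + 1):
        lyric = generate(target)
        if count_syllables(lyric) == target:
            return lyric, tries, True
    return lyric, max_tries, False


def regeneration_stats(results: List[Tuple[str, int, bool]]) -> Dict[str, float]:
    """Summarize runs of generate_until_match."""
    n = len(results)
    return {
        # Share of samples whose first attempt missed the target.
        "first_try_failure_rate": sum(1 for _, t, ok in results if t > 1 or not ok) / n,
        # Share of samples that still failed after all retries.
        "final_failure_rate": sum(1 for _, _, ok in results if not ok) / n,
        # Average number of extra generations per sample.
        "mean_regenerations": sum(t - 1 for _, t, _ in results) / n,
    }
```

If attempts were independent with failure probability q, the expected number of regenerations would be q / (1 - q), which is close to q only when q is small, so the two quantities are probably worth reporting separately.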
Experiments
BLEU?
How accurately the syllable count is matched
Translation: how similar the two lyrics are
A Computational Evaluation Framework for Singable Lyric Translation (Haven Kim)
Read the paper and follow its evaluation method?
Text-to-lyrics generation with image-based semantics and reduced risk of plagiarism (Kento Watanabe)
Read the paper and follow its evaluation method?
Which dataset did they use?
PPL, NED
They built their own input text-lyric test dataset
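Two of the measurements above are simple enough to sketch directly: how well the syllable count is matched, and NED. This is a rough sketch under assumptions: NED is taken to mean normalized edit distance, and dividing by the longer sequence length is one common normalization that may differ from the cited paper's definition. The function names are illustrative.

```python
from typing import Dict, Sequence


def syllable_match_metrics(
    predicted: Sequence[int], target: Sequence[int]
) -> Dict[str, float]:
    """Exact-match rate and mean absolute error between syllable counts."""
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("predicted and target must be non-empty and equal length")
    n = len(target)
    exact = sum(1 for p, t in zip(predicted, target) if p == t) / n
    mae = sum(abs(p - t) for p, t in zip(predicted, target)) / n
    return {"exact_match": exact, "mean_abs_error": mae}


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance via a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: Sequence, b: Sequence) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return edit_distance(a, b) / longest
```

Both functions accept strings or token lists, so the distance can be computed at the character or word level, whichever the chosen evaluation protocol calls for.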
Ablations:
Syllable-count prediction model?
Using only the syllable-count prediction model? (no syllable-count token)
MetricGAN Discriminator?