Idea…?

Tasks:
Input text, syllable-count conditioned lyric generation
Lyric translation: Singable lyric translation
- e.g., Korean lyrics → intermediate (T5) → English lyrics
Method
- Existing token-based approach
- Input:
  - T5 embedding of lyrics
  - Stable-diffusion image
  - or both
- - syllable count prediction model?
    - classification over 0 to 100 syllables?
- - MetricGAN-like training?
    
    NEW TRAINING FRAMEWORK FOR SPEECH ENHANCEMENT USING REAL NOISY SPEECH
    - (DeepDream?)
- Inference:
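  - Regenerate-until-success method
    - Average number of regenerations (a proxy for the failure probability without regeneration)
    - Failure probability even with regeneration
    - Measure and compare the metrics without regeneration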
- keyword forcing?
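The regenerate-until-success idea and the statistics listed under it can be sketched roughly as follows. This is only an illustration under stated assumptions: `generate` is a hypothetical callable standing in for the lyric model, `count_syllables` is a toy counter (one syllable per whitespace-separated token), and `max_tries` is an assumed cap that the notes do not specify.

```python
import random
from typing import Callable, Dict, Tuple


def count_syllables(line: str) -> int:
    """Toy stand-in (assumption): one syllable per whitespace-separated token."""
    return len(line.split())


def generate_with_retries(
    generate: Callable[[], str], target: int, max_tries: int = 10
) -> Tuple[str, int, bool]:
    """Call `generate` until the syllable count matches `target`.

    Returns (last generated line, number of generations used, success flag).
    """
    line = ""
    for tries in range(1, max_tries + 1):
        line = generate()
        if count_syllables(line) == target:
            return line, tries, True
    return line, max_tries, False


def retry_statistics(
    generate: Callable[[], str], target: int, n_trials: int = 1000, max_tries: int = 10
) -> Dict[str, float]:
    """Estimate the three quantities listed in the notes over `n_trials` runs."""
    total_regens = 0
    first_try_failures = 0
    final_failures = 0
    for _ in range(n_trials):
        _, tries, ok = generate_with_retries(generate, target, max_tries)
        total_regens += tries - 1  # regenerations = generations beyond the first
        first_try_failures += int(tries > 1 or not ok)
        final_failures += int(not ok)
    return {
        "avg_regenerations": total_regens / n_trials,
        "fail_rate_no_regen": first_try_failures / n_trials,
        "fail_rate_with_regen": final_failures / n_trials,
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy run is repeatable
    candidates = ["la la la", "la la", "la la la la"]  # hypothetical model outputs
    print(retry_statistics(lambda: rng.choice(candidates), target=3, n_trials=200, max_tries=5))
```

If attempts are independent and each fails with probability p, the expected number of regenerations without a cap is p / (1 - p), so the average regeneration count and the no-regeneration failure rate carry the same information but are not numerically equal.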
Experiments
- BLEU?
- How accurately the syllable count is matched
- Translation: how similar the two lyrics are
  - A Computational Evaluation Framework for Singable Lyric Translation (Haven Kim)
    - Read the paper and follow its evaluation method?
- Text-to-lyrics generation with image-based semantics and reduced risk of plagiarism (Kento Watanabe)
  - Read the paper and follow its evaluation method?
  - Which dataset did they use?
  - PPL, NED
  - They built their own input text - lyric test dataset
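Two of the measurements above are simple enough to sketch directly. The snippet below is a rough illustration, not the evaluation code of either cited paper: it assumes per-line syllable counts are already available as integers, and it assumes NED stands for normalized edit distance (the notes do not expand the abbreviation). PPL is left out because it needs a language model. All function names are hypothetical.

```python
from typing import Sequence


def syllable_match_rate(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of lines whose syllable count equals the target exactly."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not target:
        return 0.0
    return sum(p == t for p, t in zip(predicted, target)) / len(target)


def mean_abs_syllable_error(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Average absolute difference between predicted and target syllable counts."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not target:
        return 0.0
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length, so the value lies in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0
```

Reporting the exact-match rate alongside the mean absolute error separates near misses (off by one syllable) from lines that ignore the target count entirely.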
Ablations:
- - syllable count prediction model?
- When only the syllable count prediction model is used? (no syllable count token)
- - MetricGAN Discriminator?