• Tasks:
  • Lyric generation conditioned on input text and syllable count
  • Singable lyric translation
    • e.g., Korean lyrics → intermediate (T5) → English lyrics
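For the Korean → English translation task, the syllable-count condition has to come from the source lyrics. A minimal sketch of how per-line targets could be extracted, assuming each precomposed Hangul block (U+AC00 to U+D7A3) counts as one sung syllable; the function names are hypothetical and not part of any existing pipeline:

```python
from typing import List


def count_hangul_syllables(text: str) -> int:
    """Count precomposed Hangul syllable blocks (U+AC00..U+D7A3).

    Assumption: one Hangul block corresponds to one sung syllable.
    Non-Hangul characters (Latin letters, digits, punctuation) are ignored.
    """
    return sum(1 for ch in text if "\uac00" <= ch <= "\ud7a3")


def line_syllable_targets(lyrics: str) -> List[int]:
    """Return one syllable count per non-empty lyric line."""
    return [
        count_hangul_syllables(line)
        for line in lyrics.splitlines()
        if line.strip()
    ]


if __name__ == "__main__":
    print(line_syllable_targets("사랑해요\n\n보고 싶어"))
```

The resulting list could serve as the per-line syllable-count condition for the English side.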
  • Method
    • Existing token-based approach
    • Input:
      • T5 embedding of lyrics
      • Stable-diffusion image
      • or both
      • Syllable count prediction model?
        • Classification over 0 to 100 syllables?
      • MetricGAN-like training?
        • New Training Framework for Speech Enhancement Using Real Noisy Speech
        • (DeepDream?)
    • Inference:
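      • Regenerate-until-success method
        • Average number of regenerations (indicates the failure probability without regeneration)
        • Failure probability even with regeneration
        • Measure the metrics without regeneration and compare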
    • keyword forcing?
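The regenerate-until-success inference idea, together with the statistics listed under it, could be sketched as follows. This is only an illustration under stated assumptions: `generate` is a hypothetical zero-argument callable standing in for the lyric model, `count_syllables_en` is a rough vowel-group heuristic rather than a real syllabifier, and `max_tries` is an arbitrary cap.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple


def count_syllables_en(text: str) -> int:
    """Rough English syllable estimate: count vowel groups per word.

    Assumption: a heuristic stand-in, not an accurate syllabifier.
    """
    total = 0
    for word in re.findall(r"[a-z]+", text.lower()):
        groups = re.findall(r"[aeiouy]+", word)
        n = len(groups)
        # Drop a trailing silent 'e' ("love" -> 1) but keep "-le" ("little" -> 2).
        if n > 1 and groups[-1] == "e" and word.endswith("e") and not word.endswith("le"):
            n -= 1
        total += max(n, 1)
    return total


def generate_until_match(
    generate: Callable[[], str], target: int, max_tries: int = 10
) -> Tuple[Optional[str], int]:
    """Call generate() until the syllable count equals target.

    Returns (lyric, attempts_used); lyric is None if every attempt failed.
    """
    for attempt in range(1, max_tries + 1):
        candidate = generate()
        if count_syllables_en(candidate) == target:
            return candidate, attempt
    return None, max_tries


def summarize(results: List[Tuple[Optional[str], int]]) -> Dict[str, float]:
    """Aggregate the inference statistics over many prompts."""
    n = len(results)
    return {
        # Regenerations = attempts beyond the first one.
        "avg_regenerations": sum(a - 1 for _, a in results) / n,
        # First attempt failed if a retry was needed or nothing succeeded.
        "first_try_failure_rate": sum(
            1 for lyric, a in results if lyric is None or a > 1
        ) / n,
        # Still failing after all allowed attempts.
        "failure_rate_with_regeneration": sum(
            1 for lyric, _ in results if lyric is None
        ) / n,
    }
```

If attempts were independent with first-try failure probability p and no cap on retries, the expected number of regenerations would be p / (1 - p), which is why the average regeneration count and the no-regeneration failure rate carry similar information.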
  • Experiments
    • BLEU?
    • How accurately the syllable count is matched
    • Translation: how similar the two lyrics are
      • A Computational Evaluation Framework for Singable Lyric Translation (Haven Kim)
        • Read the paper and follow its evaluation method?
    • Text-to-lyrics generation with image-based semantics and reduced risk of plagiarism (Kento Watanabe)
      • Read the paper and follow its evaluation method?
      • Which dataset did they use?
      • PPL, NED
      • They built their own input text - lyric test dataset
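Two of the candidate metrics above are simple enough to sketch directly. The code below assumes that NED stands for normalized edit distance, defined here as Levenshtein distance divided by the longer sequence length (one common definition; the cited papers may normalize differently), and that syllable-count accuracy is measured per line. All function names are hypothetical.

```python
from typing import List, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance between two sequences (characters or tokens)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: Sequence, b: Sequence) -> float:
    """Levenshtein distance divided by the longer length (0.0 if both empty)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def syllable_match_rate(predicted: List[int], target: List[int]) -> float:
    """Fraction of lines whose generated syllable count equals the target."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not target:
        return 0.0
    return sum(p == t for p, t in zip(predicted, target)) / len(target)


def mean_abs_syllable_error(predicted: List[int], target: List[int]) -> float:
    """Average absolute difference between generated and target counts."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not target:
        return 0.0
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)
```

Passing token lists instead of strings to `normalized_edit_distance` gives a word-level variant, which may be closer to how overlap with training lyrics would be checked.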
  • Ablations:
    • Syllable count prediction model?
    • Using only the syllable count prediction model? (no syllable count token)
      • MetricGAN Discriminator?