
Podcast data modeling

This repository contains a podcast dataset and an implementation of the Adversarial Learning-based Podcast Representation (ALPR) introduced in the following paper:

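Longqi Yang, Yu Wang, Drew Dunne, Michael Sobolev, Mor Naaman and Deborah Estrin. 2018. More than Just Words: Modeling Non-textual Characteristics of Podcasts (http://www.cs.cornell.edu/~ylongqi/paper/YangWDSNE19.pdf). In Proceedings of WSDM’19 (http://www.wsdm-conference.org/2019/).

A pretrained model is also included. Please direct any questions to Longqi Yang (http://www.cs.cornell.edu/~ylongqi/).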

If you use this data or algorithm, please cite:

@inproceedings{yang2019podcast,
  title={More than Just Words: Modeling Non-textual Characteristics of Podcasts},
  author={Yang, Longqi and Wang, Yu and Dunne, Drew and Sobolev, Michael and Naaman, Mor and Estrin, Deborah},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  year={2019},
  organization={ACM}
}

Code descriptions

  • Converting a WAV audio into a Mel-Spectrogram: wav_to_spectrogram.py.
  • Training ALPR: alpr.py (before running, set the files variable to a list of spectrogram files).
  • Extracting ALPR using a pretrained model: alpr_extractor.py.
  • Reproducing experimental results: energy_prediction.ipynb.
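
One way to build the list of spectrogram files for alpr.py is to glob a directory of .npy files. This is a minimal sketch, not code from the repository: the helper name list_spectrogram_files is hypothetical, and it assumes all spectrograms sit directly inside a single directory.

```python
from pathlib import Path


def list_spectrogram_files(directory):
    """Return the sorted paths of all .npy files directly inside `directory`."""
    return sorted(str(p) for p in Path(directory).glob("*.npy"))


# Hypothetical usage (the directory name is an assumption about where the
# spectrograms were downloaded):
#   files = list_spectrogram_files("data/attributes_prediction_spectrograms")
```

The sorted order only makes runs repeatable; whether alpr.py depends on file order is not stated in this README.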

Data descriptions

Raw podcast audio URLs

Each line of these files contains a podcast episode represented as a JSON object with the following fields:

{
    "url": the URL to download the raw audio,
    "itunes_channel_id": the iTunes channel that the episode belongs to,
    "id": a unique episode ID,
    "title": the title of the episode
}
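
Since each line is a standalone JSON object, the files can be read line by line with the standard json module. The sketch below is illustrative only: the function name read_episodes is hypothetical, the sample record is invented, and the value types of the fields (string or number) are an assumption, since the README does not specify them.

```python
import json


def read_episodes(lines):
    """Parse an iterable of JSON lines into a list of episode dicts."""
    episodes = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        episodes.append(json.loads(line))
    return episodes


# Invented sample record, used only to show the expected fields.
sample = [
    '{"url": "http://example.com/a.mp3", "itunes_channel_id": "123", '
    '"id": "42", "title": "Pilot"}',
]
for episode in read_episodes(sample):
    print(episode["id"], episode["url"])
```

An open file handle can be passed directly in place of the sample list, since iterating over a text file yields its lines.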

Prediction labels

Prediction features and raw audio (caveat: the files are large)

  • Energy and seriousness predictions:

    • Spectrograms:
      • Download using the script download_spectrograms_attributes.sh.
      • Spectrograms are stored in the .npy (numpy array) format and are named following the rule:
        data/attributes_prediction_spectrograms/e_[episode id]_[offset].npy
    • Raw audio:
      • Download using the script download_audio_attributes.sh.
      • Audio is stored in the .wav format and is named following the rule:
        data/attributes_prediction_raw_audio/e_[episode id]_[offset].wav
  • Popularity prediction:

    • Spectrograms:

      • Download using the script download_spectrograms_popularity.sh.
      • Spectrograms are stored in the .npy (numpy array) format and are named following the rule:
        data/popularity_prediction_spectrograms/e_[episode id]_[0 -- length-1].npy
    • Transcriptions:

      • Download using the script download_transcriptions_popularity.sh.
      • Transcriptions are stored in the .txt format and are named following the rule:
        data/popularity_prediction_transcriptions/e_[episode id].txt
      • A transcription file lists transcribed words with the following format (a word per line):

      a spoken word \t starting time (ms) \t end time (ms)

    • Raw audio:

      • Download using the script download_audio_popularity.sh.
      • Audio is stored in the .wav format and is named following the rule:
        data/popularity_prediction_raw_audio/e_[episode id]_[0 -- length-1].wav
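
The naming rules and the transcription format above can be parsed with a few lines of Python. This is a sketch under stated assumptions rather than code from the repository: the helper names parse_segment_name and parse_transcription are hypothetical, the trailing offset or index in a file name is assumed to be a non-negative integer, and the transcription times are read as floats because the README does not say whether they are integers.

```python
import re
from pathlib import Path

# Assumed pattern: e_[episode id]_[index], with an integer index at the end.
_NAME_RE = re.compile(r"^e_(?P<episode_id>.+)_(?P<index>\d+)$")


def parse_segment_name(path):
    """Split a spectrogram or audio file name into (episode id, index)."""
    match = _NAME_RE.match(Path(path).stem)
    if match is None:
        raise ValueError(f"unexpected file name: {path}")
    return match.group("episode_id"), int(match.group("index"))


def parse_transcription(lines):
    """Parse 'word<TAB>start ms<TAB>end ms' lines into (word, start, end) tuples."""
    words = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():  # skip blank lines
            continue
        word, start, end = line.split("\t")
        words.append((word, float(start), float(end)))
    return words
```

The episode ID is kept as a string because its format is not documented; the greedy match means any underscores inside the ID stay part of the ID, and only the final numeric suffix is treated as the index.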

Reproducing experimental results using the pretrained model

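from alpr_extractor import ALPRExtractor

# Load the pretrained ALPR weights shipped with the repository.
extractor = ALPRExtractor()
extractor.load_model(path='pretrained_model/alpr')

# `spectrograms` must already hold the Mel-spectrogram data loaded from the .npy files.
# The input is rescaled with (x + 2) / 2 before feature extraction.
features = extractor.forward((spectrograms + 2) / 2)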