---
license: openrail
---

# Oud (عود) Unconditional Diffusion

The oud is one of the most foundational instruments in Arab music. It can be heard in nearly every song, whether the subgenre is rooted in pop or classical music, and its distinctive sound can be picked out of a crowd of string instruments with little to no training. This unconditional diffusion model is our way of paying respect to that sound and the culture it comes from. This project could not have been done without open-source audio diffusion tools.

## Usage

Using this model is no different from using any other audio diffusion model on Hugging Face.

```python
import torch
from diffusers import DiffusionPipeline

# Set up the device and create a random number generator
device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.Generator(device=device)

# Instantiate the model
model_id = "mijwiz-laboratories/oud_diffusion_unconditional_256"
audio_diffusion = DiffusionPipeline.from_pretrained(model_id).to(device)

# Seed the generator for reproducible results
seed = generator.seed()
generator.manual_seed(seed)

# Run inference
output = audio_diffusion(generator=generator)
image = output.images[0]    # Generated mel spectrogram (PIL image)
audio = output.audios[0, 0] # Raw audio waveform as a NumPy array
```
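
To listen to the result, the generated waveform can be written to a WAV file. The sketch below is illustrative: it assumes the pipeline exposes its Mel component as `mel` (as audio diffusion pipelines in `diffusers` do) and uses `scipy` for writing; the output filename is only an example.

```python
import numpy as np
from scipy.io import wavfile

# Sample rate the model was trained with (assumption: the pipeline exposes
# its Mel component as `mel`, which provides get_sample_rate()).
sample_rate = audio_diffusion.mel.get_sample_rate()

# Convert the float waveform to 16-bit PCM and write it to disk.
wavfile.write("oud_sample.wav", sample_rate, (audio * 32767).astype(np.int16))

# The mel spectrogram can also be saved for inspection.
image.save("oud_sample_mel.png")
```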

## Limitations of the Model

The training dataset was very small, so the diversity of snippets the model can generate is rather limited. Furthermore, in high-intensity segments (e.g., when the instrument is played forcefully), the realism and naturalness of the generated oud samples degrade.