audioFlux

audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It supports dozens of time-frequency transform methods and hundreds of corresponding time-domain and frequency-domain feature combinations. The extracted features can be fed to deep learning networks for training and used to study tasks in the audio field such as classification, separation, music information retrieval (MIR), and ASR.

Overview

audioFlux is based on a data stream design. It structurally decouples each algorithm module, so features of multiple dimensions can be extracted quickly and efficiently. The following is the main feature architecture diagram.

You can combine features of multiple dimensions, train different deep learning networks on them, and study tasks in the audio field such as classification, separation, and MIR.

The main functions of audioFlux are organized into the transform, feature, and mir modules.

1. Transform

The main transform algorithms for time-frequency representation:

The transforms above support all of the following frequency scale types:

The following transforms do not support multiple frequency scale types and are used only as independent transforms:

For detailed descriptions of the transform functions and their usage, see the documentation.

_Synchrosqueezing_ or _reassignment_ is a technique for sharpening a time-frequency representation. It includes the following algorithms:

2. Feature

The feature module contains the following algorithms:

3. MIR

The mir module contains the following algorithms:

Installation

The library is cross-platform and currently supports Linux, macOS, Windows, iOS and Android systems.

Python Package Install

To install audioFlux, use the released Python package. Python >=3.6 is required.

Using PyPI:

$ pip install audioflux 

Using Anaconda:

$ conda install -c tanky25 -c conda-forge audioflux
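
After either install, one way to confirm that the package is visible to the active interpreter is to query its distribution metadata. This is a minimal sketch using only the standard library (`importlib.metadata`, which needs Python 3.8 or later); the helper name `installed_version` is illustrative and not part of audioFlux.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("audioflux")
if version:
    print(f"audioflux {version} is installed")
else:
    print("audioflux is not installed in this environment")
```

If the second message appears, check that `pip` or `conda` installed into the same environment that runs the script.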

Other Build

Quickstart

More example scripts are provided in the Documentation section.

Benchmark

Server Performance

Server hardware:

- CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
- Memory: 128GB

Each sample is 128 ms of audio (sampling rate: 32000 Hz, data length: 4096 samples).

Total time spent extracting features from 1000 samples:

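| Package | audioFlux | librosa | pyAudioAnalysis | python_speech_features |
| ------- | --------- | ------- | --------------- | ---------------------- |
| Mel     | 0.777s    | 2.967s  | -               | -                      |
| MFCC    | 0.797s    | 2.963s  | 0.805s          | 2.150s                 |
| CQT     | 5.743s    | 21.477s | -               | -                      |
| Chroma  | 0.155s    | 2.174s  | 1.287s          | -                      |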
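
The measurement described above (total time across 1000 samples of 4096 values each) can be approximated with a small timing harness. The sketch below is not the project's benchmark script: `make_samples`, `total_extraction_time`, and the stand-in extractor `rms_energy` are hypothetical names, and random data replaces real audio. To compare libraries, pass each library's own feature function as `extract`.

```python
import math
import random
import time


def make_samples(count=1000, length=4096, seed=0):
    """Generate `count` synthetic samples of `length` values in [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(length)] for _ in range(count)]


def rms_energy(sample):
    """Stand-in feature extractor: root-mean-square energy of one sample."""
    return math.sqrt(sum(x * x for x in sample) / len(sample))


def total_extraction_time(extract, samples):
    """Return the wall-clock seconds spent calling `extract` on every sample."""
    start = time.perf_counter()
    for sample in samples:
        extract(sample)
    return time.perf_counter() - start
```

Calling `total_extraction_time(rms_energy, make_samples())` returns the elapsed seconds for the default 1000 samples. Sample generation happens outside the timed region, so only the extraction itself is measured.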

Mobile Performance

Each frame is 128 ms of audio (sampling rate: 32000 Hz, data length: 4096 samples).

Time spent extracting features from 1 frame:

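| Mobile | iPhone 13 Pro | iPhone X | Honor V40 | OPPO Reno4 SE 5G |
| ------ | ------------- | -------- | --------- | ---------------- |
| Mel    | 0.249ms       | 0.359ms  | 0.313ms   | 0.891ms          |
| MFCC   | 0.249ms       | 0.361ms  | 0.315ms   | 1.116ms          |
| CQT    | 0.350ms       | 0.609ms  | 0.786ms   | 1.779ms          |
| Chroma | 0.354ms       | 0.615ms  | 0.803ms   | 1.775ms          |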
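
To put the per-frame timings in context, a frame of 4096 samples at 32000 Hz lasts 4096 / 32000 s = 128 ms, so each extraction time can be expressed as a fraction of the frame's own duration. The sketch below assumes non-overlapping frames; the function names are illustrative and not part of audioFlux.

```python
def frame_duration_ms(data_length, sample_rate):
    """Duration of one frame in milliseconds."""
    return 1000.0 * data_length / sample_rate


def realtime_load(extract_ms, data_length=4096, sample_rate=32000):
    """Fraction of a frame's duration consumed by feature extraction."""
    return extract_ms / frame_duration_ms(data_length, sample_rate)


print(frame_duration_ms(4096, 32000))   # 128.0
print(f"{realtime_load(1.779):.2%}")    # slowest timing in the table (CQT, OPPO Reno4 SE 5G)
```

Under that assumption, even the slowest timing listed uses roughly 1.4% of the frame duration.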

Documentation

Documentation of the package can be found online:

https://audioflux.top

Contributing

We are more than happy to collaborate and receive your contributions to audioFlux. If you want to contribute, please fork the latest git repository and create a feature branch. Submitted requests should pass all continuous integration tests.

You are also welcome to suggest improvements. If you need help, find a bug, have a feature request, want to propose a new algorithm, or have a general question, open an issue.

Citing

If you want to cite audioFlux in a scholarly work, please use the following:

License

The audioFlux project is available under the MIT License.

https://github.com/libAudioFlux/audioFlux