hybrid-vocal-classifier (hvc)

Voice to text for songbirds


hybrid-vocal-classifier (hvc for short) makes it easy to segment and classify vocalizations with machine learning algorithms, and to compare the performance of different algorithms.

The main application is for scientists studying birdsong. (You can read more about that on the "More about hybrid-vocal-classifier and songbird science" page.)

Running hvc requires almost no coding. Users write simple Python scripts, and most will only have to adapt the examples from the documentation. Large batch jobs can be run with configuration files written in YAML (a simple language that is meant to be easy for humans to read and write). Here too, most users will only have to copy the example .yml files and change a few parameters.

This code sample gives a high-level view of how you run hvc:

import hvc

# extract features from audio to train machine learning models
hvc.extract('extract_config.yml')  # using .yml config file
# train models/classifiers and select model with best accuracy
hvc.select('select_config.yml')
# use trained model to predict labels for unlabeled data
hvc.predict('predict_config.yml')
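
To make the "select model with best accuracy" step more concrete, here is a minimal sketch in plain Python of what comparing models by accuracy means. It is not hvc's implementation: the function names (accuracy, select_best), the model names, and the syllable labels are all hypothetical and chosen only for illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predicted labels that match the hand labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy of an empty sequence")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def select_best(true_labels, predictions_by_model):
    """Return (model_name, accuracy) for the model with the highest accuracy."""
    scores = {
        name: accuracy(true_labels, predicted)
        for name, predicted in predictions_by_model.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical hand labels for eight syllables, and the labels that
# two hypothetical trained models assigned to the same syllables.
hand_labels = list("aabbccdd")
predictions = {
    "knn": list("aabbccdc"),  # one syllable mislabeled
    "svm": list("aabbccdd"),  # all syllables labeled correctly
}

best_model, best_accuracy = select_best(hand_labels, predictions)
print(best_model, best_accuracy)
```

In practice the predictions would come from classifiers trained on extracted features and scored on held-out data; the sketch only shows the comparison step.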

Advantages of hybrid-vocal-classifier

  • frees up hundreds of hours otherwise spent hand-labeling data
  • completely open source, free
  • makes it easy to compare multiple machine learning algorithms
  • almost no coding required, configurable with text files
  • built on top of Python packages road-tested by the greater data-science community:
    numpy, scipy, matplotlib, scikit-learn, keras


See Installation.


If you are having issues, please let us know.
Please report bugs on the Issue Tracker, and ask questions in the users' group.


BSD license.

Code of Conduct

We welcome contributions to the codebase and the documentation, and are happy to help first-time contributors through the process. Project maintainers and contributors are expected to uphold the code of conduct described here: Contributor Covenant Code of Conduct