A large music model using MIDI representations

In this project we will train a large machine learning model that represents MIDI music events natively. This strikes a balance between representing music as raw audio, which is computationally expensive and poorly interpretable, and representing it in higher-level symbolic form, which introduces implicit biases and reduces the amount of available data. Among other things, this lets us investigate the impact of biases toward Western music in the training data, and examine how well such a model generalizes outside its training distribution.
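To make "representing MIDI events natively" concrete, one common approach is to serialize a MIDI stream into a vocabulary of discrete event tokens (note-on, note-off, time shifts, velocity bins) that a sequence model can consume. The sketch below is illustrative only: the token names, velocity bucketing, and event tuple format are assumptions for the example, not the representation this project will necessarily use.

```python
def encode_events(events):
    """Encode (delta_ticks, kind, pitch, velocity) tuples as event tokens.

    Time between events becomes TIME_SHIFT tokens; velocity (0-127) is
    bucketed into 4 coarse bins so the vocabulary stays small. Token names
    here are hypothetical, chosen for readability.
    """
    tokens = []
    for delta, kind, pitch, velocity in events:
        if delta > 0:
            tokens.append(f"TIME_SHIFT_{delta}")
        if kind == "on":
            tokens.append(f"VELOCITY_{velocity // 32}")  # bin 0-3
            tokens.append(f"NOTE_ON_{pitch}")
        else:
            tokens.append(f"NOTE_OFF_{pitch}")
    return tokens

# Example: a C major triad struck together, released after 480 ticks.
events = [
    (0, "on", 60, 96), (0, "on", 64, 96), (0, "on", 67, 96),
    (480, "off", 60, 0), (0, "off", 64, 0), (0, "off", 67, 0),
]
print(encode_events(events))
```

A model trained on sequences like this sees timing and dynamics directly, rather than inferring them from audio or losing them to a notation format.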