I wondered whether pitch might play a role. EDIT: singing in a higher key got me...

ciaranb4 · on Oct 16, 2020

It is very unlikely that they consider the absolute sung or hummed pitch, since very few people, including professional musicians, would start singing at the correct pitch without accompaniment.

Most likely they are mapping the intervals between successive sung notes and using those as part of the ‘melodic fingerprint’ for matching.
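
To make the interval idea concrete, here is a minimal sketch of a transposition-invariant fingerprint. It is only a guess at the general technique, not how the actual service works; the function names `hz_to_midi` and `interval_fingerprint` are made up for illustration, and it assumes the notes have already been extracted from the audio and rounded to semitones.

```python
import math


def hz_to_midi(freq_hz):
    """Map a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def interval_fingerprint(midi_notes):
    """Return the semitone step between each pair of consecutive notes.

    Hypothetical helper: only relative movement is kept, so the
    starting pitch (the key) drops out entirely.
    """
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


# Opening of "Twinkle, Twinkle" as MIDI notes: C C G G A A G
original = [60, 60, 67, 67, 69, 69, 67]

# The same melody sung a perfect fourth (5 semitones) higher
transposed = [note + 5 for note in original]

# Both versions produce the same fingerprint
print(interval_fingerprint(original) == interval_fingerprint(transposed))
```

Because only the differences are stored, singing in a higher or lower key yields an identical sequence. A real matcher would presumably also need to tolerate off-key notes and tempo changes, which this sketch ignores.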