Perceiving the relationships between pitches as tonality is a key listening skill in Western tonal music. Tonal hierarchies (i.e., genre-dependent differences in the prominence of tones) are reflected in long-term memory as internal representations of tonal hierarchies (IRTH). Over the past 40 years, research on how individuals (primarily students aged 6 to 15, as well as adults) acquire IRTH has yielded varied and sometimes contradictory conclusions about the timeline and underlying mechanisms of this process. This review synthesizes the evidence and critically examines potential reasons for the heterogeneity in prior findings. To this end, two approaches were applied. First, a Bayesian three-level meta-analysis of 60 effect sizes from 16 studies, reported in 13 articles, revealed a medium-sized difference in IRTH sensitivity between younger and older participants. Second, a model comparison based on cross-sectional data from a single study indicated that the relationship between sensitivity and age is best described by a non-linear growth pattern with a larger increase during adolescence. We also examined the considerable heterogeneity observed within and between studies, particularly how task-specific features of the operationalizations might account for these differences. These findings contribute to the development of theoretical models of music-related skill acquisition and suggest directions for future research.