Creating percentile charts of body mass index by age and gender

Author: Chris Martin
Institution: Crystallise Limited
Type of case study: Research

About the research

The body mass index, or BMI, is a calculation used to estimate the level of body fat. It can help doctors assess an individual’s overall fitness and their risk of developing chronic diseases.
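As a brief illustration of the calculation, BMI is weight in kilograms divided by the square of height in metres. The sketch below shows this in Python; the function name and the example values are illustrative and are not taken from the study.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Illustrative values only: a 70 kg adult who is 1.75 m tall.
print(round(bmi(70.0, 1.75), 1))  # 22.9
```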

When considering BMI among the population, it is important to know average values by age and gender. However, the impacts of variation in body mass index are non-linear, with the negative consequences concentrated in individuals with the highest or lowest values. Modelling the effect of variation in BMI on health outcomes therefore requires an understanding of the distribution of BMI and not just its average values. When assessing health risks, it is useful for health practitioners to be able to compare an individual with the general population. Percentile charts are a convenient graphical way of putting an individual’s BMI into perspective.

Crystallise, an analytics and modelling company that services the finance, healthcare and pharmaceuticals sectors, explored the distribution of body mass index as a health risk factor. More specifically, the company set out to identify the distribution of body mass index by age and gender among the population in England. This work contributed to Crystallise’s development of morbidity and mortality models for those industries.

In the healthcare sector, the Crystallise Sonata mortality model is a freely accessible tool designed for use by doctors and nurses wanting to communicate risk with patients based upon factors like smoking, blood pressure and body mass index. It can be used to motivate healthy behaviour change or inform shared decision-making on treatment.

It is well established that the statistical judgements of even statistically skilled experts are prone to bias and that this can lead to sub-optimal decision-making (Kahneman 2002). Ergonomic and intuitive tools like Sonata, designed to be used in real time with patients, may help minimize the impact of biased risk estimation on decision-making in the consultation and encourage greater patient participation in the process.

Considerable resources are needed in the development of such models, and the return on investment can be achieved by applying the same methodologies in the pharmaceuticals industry when estimating the potential impact of novel treatments, or in the financial sector for life insurance or pensions risk.


The research used data from the Health Survey for England 2012 to parameterise the model, so that an individual’s risk factor values could be compared with background population averages. Gamma distributions were fitted to the raw survey data, segmented by gender and deciles of age, generating location, shape and rate parameters for each age band. The values of these parameters were then smoothly interpolated to individual years of age and calibrated to the observed mean BMI values in the age bands. The analysis was carried out in the statistical software R (‘The R Project’), using the MASS, nls2, Hmisc, ppcor and plyr libraries.
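To make the fitting step more concrete, the following is a minimal sketch of how a gamma distribution with location, shape and rate parameters can be fitted to one age band and then used to read off percentiles. It is written in Python with the standard library only, whereas the study used R. It uses a method-of-moments fit as a simple stand-in, since the source does not state which estimation method was used. All function names are hypothetical, and the data are synthetic rather than Health Survey for England values.

```python
import math
import random
import statistics


def fit_shifted_gamma(values):
    """Method-of-moments fit returning (location, shape, rate)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    skew = sum((v - mean) ** 3 for v in values) / (n * sd ** 3)
    if skew <= 0:
        raise ValueError("this moment-based fit needs right-skewed data")
    shape = 4.0 / skew ** 2          # gamma skewness is 2 / sqrt(shape)
    rate = 2.0 / (sd * skew)         # gamma sd is sqrt(shape) / rate
    location = mean - shape / rate   # gamma mean is location + shape / rate
    return location, shape, rate


def reg_lower_gamma(a, x):
    """Regularised lower incomplete gamma function P(a, x) via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    k = a
    for _ in range(10000):
        k += 1.0
        term *= x / k
        total += term
        if abs(term) < abs(total) * 1e-14:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def percentile_of(bmi, location, shape, rate):
    """Percentile (0 to 100) of a BMI value under the fitted distribution."""
    return 100.0 * reg_lower_gamma(shape, rate * (bmi - location))


def bmi_at_percentile(pct, location, shape, rate):
    """Invert percentile_of by bisection; pct must lie strictly between 0 and 100."""
    if not 0.0 < pct < 100.0:
        raise ValueError("pct must be between 0 and 100")
    lo, width = location, shape / rate
    # Widen the search interval until it brackets the requested percentile.
    while percentile_of(location + width, location, shape, rate) < pct:
        width *= 2.0
    hi = location + width
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if percentile_of(mid, location, shape, rate) < pct:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Synthetic, illustrative data for a single age band (not survey values).
random.seed(1)
sample = [15.0 + random.gammavariate(9.0, 1.3) for _ in range(5000)]
location, shape, rate = fit_shifted_gamma(sample)
for pct in (5, 25, 50, 75, 95):
    print(pct, round(bmi_at_percentile(pct, location, shape, rate), 1))
```

Repeating the fit for each gender and age band gives one set of parameters per band. Evaluating `bmi_at_percentile` across ages then traces out the curves of a percentile chart, and `percentile_of` places an individual's BMI on it.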


Particular problems that needed resolving included the handling of missing and implausible outlying data. This challenge was overcome by filtering out values that fell outside reasonable bounds. Parameter values were translated to single years of age using smooth interpolation, and a number of methods were tried, including fitting 3rd order polynomials. This approach produced acceptable results, but may still be vulnerable to artefacts at the extremes of age. Further analysis of the potential bias generated by missing values would strengthen this study.
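The two steps described above can be sketched as follows, again in standard-library Python rather than the R used in the study. The plausibility bounds (10 to 70 kg/m²), the age-band midpoints, the per-band shape values and all function names are assumptions made for illustration. The source does not give the bounds or the fitted values it used.

```python
def filter_plausible(values, lower=10.0, upper=70.0):
    """Drop missing values and values outside assumed plausibility bounds."""
    return [v for v in values if v is not None and lower <= v <= upper]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_cubic(ages, values, centre=50.0, scale=30.0):
    """Least-squares 3rd order polynomial; returns a function of age."""
    # Centre and scale age so the normal equations stay well conditioned.
    ts = [(age - centre) / scale for age in ages]
    ata = [[sum(t ** (i + j) for t in ts) for j in range(4)] for i in range(4)]
    atb = [sum((t ** i) * v for t, v in zip(ts, values)) for i in range(4)]
    coeffs = solve_linear(ata, atb)

    def predict(age):
        t = (age - centre) / scale
        return sum(c * t ** k for k, c in enumerate(coeffs))

    return predict


# Hypothetical raw readings with a missing value and two implausible outliers.
print(filter_plausible([22.5, None, 5.0, 31.0, 150.0]))  # [22.5, 31.0]

# Hypothetical shape parameters at age-band midpoints (illustrative only).
band_midpoints = [20, 30, 40, 50, 60, 70, 80]
band_shapes = [9.8, 9.1, 8.7, 8.6, 8.9, 9.5, 10.4]
shape_at_age = fit_cubic(band_midpoints, band_shapes)
for age in (25, 45, 65):
    print(age, round(shape_at_age(age), 2))
```

A cubic is only constrained by the band midpoints it is fitted to, so evaluating it well below the youngest or above the oldest midpoint is extrapolation. That is one way the artefacts at the extremes of age noted above can arise.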

Successes and conclusion

Crystallise successfully generated percentile charts of BMI by age and gender in a format that can support risk communication in the healthcare setting. This work contributed to Sonata, the mortality modelling framework developed by Crystallise. The model has been used in general practice consultations with patients to discuss risk, to motivate behaviour modification and to support shared decision-making on treatments such as starting anti-hypertensive therapy.

The distribution functions will be fed into the next generation of risk communication models to be developed by Crystallise. The same approach could be applied to other risk factors such as systolic blood pressure, and the company continues its work on other parameters of interest such as HbA1c and the ratio of total cholesterol to high-density lipoprotein cholesterol. In addition to communicating risk for individuals, these models will be useful for modelling the impact of treatment and for informing prescribing policy.


To read more about this research:

The percentile charts and the Sonata model are freely accessible on the company’s website:
