This is an R Markdown report of the analysis of hospital prescription data.
I have transformed the ten original datasets, joined them together and analysed. I have prepared nine questions about this dataset. I have shown the total number and the total price of the prescriptions. Only in hospitals the spending on prescriptions from January till October 2022 consists of around £800M. The most prescribed medical area (by BNF chapter) is Central Nervous System. The most prescribed medicine by cost is Physeptone by number of items is Methadone. I also showed the time series analysis of the growing expenses by month and found the three-month moving average that is around £80M.
My aim doing this project was to improve my wrangling skills using
dplyr
and tidyverse
packages. And demonstrate
the ability to produce insights from the real world data.
Data comes from NHS Business Service Authority (NHSBSA) : https://www.nhsbsa.nhs.uk/prescription-data/prescribing-data/hospital-prescribing-dispensed-community It is a link where more information can be found.
The link to download actual data is here https://opendata.nhsbsa.net/dataset/hospital-prescribing-dispensed-in-the-community
The data captured from prescription processing is used to calculate reimbursement and remuneration. The data provided is based on England Hospital Trust prescribing only and includes prescription items which were prescribed in England and dispensed in the community in England as well as items prescribed in England and dispensed in Wales, Scotland, and the Isle of Man.
It is a monthly dataset. The data was downloaded in December 2022. The datasets available at the time were from January to October 2022.
This data excludes:
This data does not contain any information about the patients including their diagnosis and demographics.
Each monthly dataset consists of the next variables:
Period name is the month and year when prescriptions was recorded through the system.
BNF name and BNF code are the British National Formulary name of a medical product and a 15 digit code in which the first seven digits are allocated according to the categories in the BNF and the last 8 digits represent the medicinal product.
Hospital Trust code and names are the code and name of the trust where the product was prescribed.
Quantity is the quantity prescribed.
Total Items shows the number of times a product appears on a prescription form not the quantity prescribed.
Total Quantity is the number of items multiplied by the quantity prescribed.
Total Net Ingredient cost NIC is the basic price of a drug.
Total Actual Cost is the calculated value including discounts and some adjustments. Actual Cost only relates to England.
As an example I am showing here January 2022 data
#load the data
<- read_rds("00_data/hosp_prescrip_01_22_tbl.rds")
hosp_prescrip_01_22_tbl
<- read_rds("00_data/hosp_pr_2022_post.rds")
hosp_pr_2022_post
%>% glimpse() hosp_prescrip_01_22_tbl
## Rows: 140,478
## Columns: 10
## $ PERIOD <int> 202201, 202201, 202201, 202201, 202201, 202201, 20…
## $ BNF.NAME <chr> "Movicol Paediatric Chocolate oral powder 6.9g sac…
## $ BNF.CODE <chr> "0106040M0BBAJAB", "0404000M0BGAEAQ", "0501060D0AA…
## $ HOSPITAL.TRUST.CODE <chr> "R0A00", "R0A00", "R0A00", "R0A00", "R0A00", "R0B0…
## $ HOSPITAL.TRUST <chr> "MANCHESTER UNIVERSITY NHS FT", "MANCHESTER UNIVER…
## $ QUANTITY <dbl> 21.0, 20.0, 168.0, 224.0, 60.0, 30.0, 56.0, 56.0, …
## $ TOTAL.QUANTITY <dbl> 21.0, 20.0, 168.0, 224.0, 120.0, 30.0, 56.0, 56.0,…
## $ TOTAL.ITEMS <int> 1, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, 1, 2, 1, 1, 4, 1,…
## $ TOTAL.ACTUAL.COST <dbl> 2.9844190, 18.1116328, 11.0794958, 28.9009815, 23.…
## $ TOTAL.NIC <dbl> 3.07, 19.24, 11.83, 30.88, 25.52, 72.00, 2.90, 109…
I have made the transformations and joined individual datasets together to have one table.
The transformation included:
lubridate
packageThe first 6 rows of a final table that will be used for further analysis:
%>% glimpse() hosp_pr_2022_post
## Rows: 1,471,460
## Columns: 24
## $ PERIOD <int> 202201, 202201, 202201, 202201, 202201, 202201, …
## $ PERIOD.DATE <date> 2022-01-01, 2022-01-01, 2022-01-01, 2022-01-01,…
## $ PERIOD.YEAR <dbl> 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, …
## $ PERIOD.MONTH <ord> Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan…
## $ BNF.NAME <chr> "Movicol Paediatric Chocolate oral powder 6.9g s…
## $ BNF.CODE <chr> "0106040M0BBAJAB", "0404000M0BGAEAQ", "0501060D0…
## $ BNF.CHAPTER.CODE <chr> "01", "04", "05", "10", "13", "11", "05", "04", …
## $ BNF.CHAPTER.PLUS.CODE <chr> "01: Gastro-Intestinal System", "04: Central Ner…
## $ BNF.CHAPTER.DESCRPT <chr> " Gastro-Intestinal System", " Central Nervous S…
## $ HOSPITAL.TRUST.CODE <chr> "R0A00", "R0A00", "R0A00", "R0A00", "R0A00", "R0…
## $ QUANTITY <dbl> 21.0, 20.0, 168.0, 224.0, 60.0, 30.0, 56.0, 56.0…
## $ TOTAL.QUANTITY <dbl> 21.0, 20.0, 168.0, 224.0, 120.0, 30.0, 56.0, 56.…
## $ TOTAL.ITEMS <int> 1, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, 1, 2, 1, 1, 4, …
## $ TOTAL.ACTUAL.COST <dbl> 2.9844190, 18.1116328, 11.0794958, 28.9009815, 2…
## $ TOTAL.NIC <dbl> 3.07, 19.24, 11.83, 30.88, 25.52, 72.00, 2.90, 1…
## $ TOTAL.ITEMS.COST <dbl> 2.9844190, 18.1116328, 11.0794958, 28.9009815, 4…
## $ HOSPITAL.TRUST <chr> "MANCHESTER UNIVERSITY NHS FOUNDATION TRUST", "M…
## $ CODE <chr> "R0A", "R0A", "R0A", "R0A", "R0A", "R0B", "R0D",…
## $ ORGANISATION <chr> "MANCHESTER UNIVERSITY NHS FOUNDATION TRUST", "M…
## $ ADDRESS.1 <chr> "COBBETT HOUSE", "COBBETT HOUSE", "COBBETT HOUSE…
## $ ADDRESS.2 <chr> "OXFORD ROAD, MANCHESTER", "OXFORD ROAD, MANCHES…
## $ AREA <chr> "MANCHESTER", "MANCHESTER", "MANCHESTER", "MANCH…
## $ REGION <chr> "GREATER MANCHESTER", "GREATER MANCHESTER", "GRE…
## $ POST.CODE <chr> "M13 9WL", "M13 9WL", "M13 9WL", "M13 9WL", "M13…
%>% head() hosp_pr_2022_post
In my report I wanted to answer the next questions:
%>%
hosp_pr_2022_post summarise(total.items = sum(TOTAL.ITEMS),
total.cost = sum(TOTAL.ITEMS.COST)) %>%
mutate(total.items.chr = scales::number(total.items),
total.cost.chr = scales::dollar(total.cost, prefix = "£")) %>%
select(total.items.chr, total.cost.chr) %>%
#rename the columns
rename("Total Items" = total.items.chr,
"Total Cost" = total.cost.chr) %>%
#add a table
flextable() %>%
autofit() %>%
bold(part = "header") %>%
add_footer_lines("") %>%
add_header_lines("Table 1. Prescriptions in 2022")
Table 1. Prescriptions in 2022 | |
---|---|
Total Items | Total Cost |
3 915 437 | £786,398,249 |
We can see on the Fig. 1 that the most spending on medicines is in the next BNF chapters: Central Nervous System, Skin and Malignant disease and Immunosuppression. Those three chapters cover 73.4% of all prescriptions in 2022. Already the first chapter, Central Nervous System, contributes to more than half of the spending and consists of more than £459M.
##data transformation
<- 7
n
%>%
hosp_pr_2022_post select(BNF.CHAPTER.CODE, BNF.CHAPTER.DESCRPT, TOTAL.ITEMS, TOTAL.ITEMS.COST) %>%
mutate(BNF.CHAPTER.CODE = as_factor(BNF.CHAPTER.CODE) %>%
fct_lump(n = n, w = TOTAL.ITEMS.COST)) %>%
group_by(BNF.CHAPTER.CODE, BNF.CHAPTER.DESCRPT) %>%
summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
ungroup() %>% view()
#I've got not really what I wanted. I wanted to have one description as a factor and than total price and number, anyway try again below:
<- hosp_pr_2022_post %>%
top_prescr_by_chapt_tbl select(BNF.CHAPTER.PLUS.CODE, TOTAL.ITEMS, TOTAL.ITEMS.COST) %>%
mutate(BNF.CHAPTER.PLUS.CODE = as_factor(BNF.CHAPTER.PLUS.CODE) %>%
fct_lump(n = n, w = TOTAL.ITEMS.COST)) %>%
group_by(BNF.CHAPTER.PLUS.CODE) %>%
summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
ungroup() %>%
#I want to arrange it in the right order for a plot
mutate(BNF.CHAPTER.PLUS.CODE = BNF.CHAPTER.PLUS.CODE %>% fct_reorder(COST)) %>%
mutate(BNF.CHAPTER.PLUS.CODE = BNF.CHAPTER.PLUS.CODE %>% fct_relevel("Other", after = 0)) %>%
arrange(desc(BNF.CHAPTER.PLUS.CODE)) %>%
#I want to add label to my plot
mutate(N.ITEMS_text = scales::number(N.ITEMS),
COST_text = scales::dollar(COST, scale = 1e-6, prefix = "£", suffix = "M")) %>%
#Cumulative percent
mutate(cum_pct = cumsum(COST)/sum(COST)) %>%
mutate(cum_pct_text = scales::percent(cum_pct)) %>%
#Rank
mutate(rank = row_number()) %>%
mutate(rank = case_when(
== max(rank) ~ NA_integer_,
rank TRUE ~ rank)) %>%
#Label text
mutate(label_text = str_glue("Rank: {rank}\nCost: {COST_text}\nCumPct: {cum_pct_text}"))
#Now the visualisation
<- top_prescr_by_chapt_tbl %>%
g1
#plot
ggplot(aes(COST, BNF.CHAPTER.PLUS.CODE)) +
#geometries
geom_segment(aes(xend = 0, yend = BNF.CHAPTER.PLUS.CODE),
colour = palette_light()[1],
size = 1) +
geom_point(aes(size = COST),
colour = palette_light()[1]) +
#labels
geom_label(aes(label = label_text),
hjust = "inward",
size = 2.5,
colour = palette_light()[1]) +
#Formating
scale_x_continuous(labels = scales::dollar_format(scale = 1e-6, prefix = "£", suffix = "M")) +
labs(
title = str_glue("Fig.1. Top {n} prescriptions in 2022"),
x = "Cost (£M)",
y = "BNF chapter",
caption = str_glue("Already first chapter contributes
to half of the spending on prescriptions")
+
)
theme_tq() +
theme(
legend.position = "none",
title = element_text(face = "bold"),
plot.caption = element_text(face = "bold.italic")
) g1
We can see that the first chapter number 04 contributes to the most expensive spending that exceed £148M (for the first three medicines). Physotone oral solution is the most expensive medicine taking into consideration the price and the quantity (around £60M and more than 76k items).
#data transformation
<- hosp_pr_2022_post %>% group_by(BNF.CHAPTER.CODE, BNF.CHAPTER.DESCRPT, BNF.NAME) %>%
top_3_med_bnf_chap_price summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
arrange(desc(COST)) %>%
ungroup() %>%
group_by(BNF.CHAPTER.CODE) %>%
top_n(3)
#level adjustment
<- hosp_pr_2022_post %>%
sorted select(BNF.CHAPTER.CODE, BNF.CHAPTER.DESCRPT, BNF.NAME, TOTAL.ITEMS, TOTAL.ITEMS.COST) %>%
group_by(BNF.CHAPTER.CODE, BNF.CHAPTER.DESCRPT) %>%
summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
arrange(desc(COST)) %>%
ungroup() %>% view()
<- sorted$BNF.CHAPTER.CODE
a
#presentation into table
%>%
top_3_med_bnf_chap_price arrange(BNF.CHAPTER.CODE, desc(COST)) %>%
arrange(factor(BNF.CHAPTER.CODE, levels = a)) %>%
filter(BNF.CHAPTER.CODE %in% a[1:7]) %>%
#change the presentation of the numbers in table
mutate(N.ITEMS = scales::number(N.ITEMS),
COST = scales::dollar(COST, prefix = "£")) %>%
#rename the columns
rename("BNF Chapter\nCode" = BNF.CHAPTER.CODE,
"Chapter Description" = BNF.CHAPTER.DESCRPT,
"BNF Name" = BNF.NAME,
"Number of Items" = N.ITEMS,
"Total Cost" = COST) %>%
#add a table
flextable() %>%
autofit() %>%
bold(part = "header") %>%
add_footer_lines("") %>%
add_header_lines("Table 2. Top 3 medicine in each BNF chapter (top 7) sorted by cost")
Table 2. Top 3 medicine in each BNF chapter (top 7) sorted by cost | ||||
---|---|---|---|---|
BNF Chapter | Chapter Description | BNF Name | Number of Items | Total Cost |
04 | Central Nervous System | Physeptone 1mg/ml oral solution sugar free | 76 060 | £59,905,519 |
04 | Central Nervous System | Circadin 2mg modified-release tablets | 40 066 | £48,753,512 |
04 | Central Nervous System | Espranor 8mg oral lyophilisates | 14 632 | £39,958,794 |
13 | Skin | Efudix 5% cream | 15 763 | £27,826,439 |
13 | Skin | Stelara 90mg/1ml solution for injection pre-filled syringes | 462 | £27,666,071 |
13 | Skin | Enstilar 50micrograms/g / 0.5 mg/g cutaneous foam | 6 486 | £3,986,420 |
21 | Appliances | Thealoz Duo eye drops preservative free | 14 283 | £14,098,579 |
21 | Appliances | Hylo-Forte 0.2% eye drops preservative free | 18 175 | £13,119,399 |
21 | Appliances | FreeStyle Libre 2 Sensor | 1 927 | £3,514,666 |
11 | Eye | Atropine 1% eye drops | 2 170 | £4,143,625 |
11 | Eye | Chloramphenicol 0.5% eye drops | 9 552 | £3,639,891 |
11 | Eye | Chloramphenicol 1% eye ointment | 15 760 | £3,552,518 |
08 | Malignant Disease and Immunosuppression | Adoport 1mg capsules | 2 647 | £10,692,071 |
08 | Malignant Disease and Immunosuppression | Adoport 0.5mg capsules | 1 387 | £3,355,469 |
08 | Malignant Disease and Immunosuppression | Mycophenolate mofetil 500mg tablets | 9 202 | £2,478,820 |
05 | Infections | Co-amoxiclav 500mg/125mg tablets | 55 125 | £6,566,015 |
05 | Infections | Flucloxacillin 500mg capsules | 41 584 | £6,135,470 |
05 | Infections | Nitrofurantoin 100mg modified-release capsules | 16 788 | £5,256,293 |
01 | Gastro-Intestinal System | Plenvu oral powder sachets | 2 231 | £6,082,030 |
01 | Gastro-Intestinal System | Moviprep oral powder sachets | 2 716 | £5,143,292 |
01 | Gastro-Intestinal System | Sulfasalazine 500mg gastro-resistant tablets | 4 816 | £5,035,670 |
The most prescribed medicines aka area of disease are Central nervous systems, Infections and Eye treatment. Indicating the prevalent disease areas in 2022 for people who get the prescriptions from hospitals.
%>% group_by(BNF.CHAPTER.CODE, BNF.CHAPTER.DESCRPT) %>%
hosp_pr_2022_post summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
arrange(desc(N.ITEMS)) %>%
ungroup() %>%
#change the presentation of the numbers in table
mutate(N.ITEMS = scales::number(N.ITEMS),
COST = scales::dollar(COST, prefix = "£")) %>%
#rename the columns
rename("BNF Chapter\nCode" = BNF.CHAPTER.CODE,
"Chapter Description" = BNF.CHAPTER.DESCRPT,
"Number of Items" = N.ITEMS,
"Total Cost" = COST) %>%
#add a table
flextable() %>%
autofit() %>%
bold(part = "header") %>%
add_footer_lines("") %>%
add_header_lines("Table 3. Total number of prescriptions in each BNF chapter and its cost")
Table 3. Total number of prescriptions in each BNF chapter and its cost | |||
---|---|---|---|
BNF Chapter | Chapter Description | Number of Items | Total Cost |
04 | Central Nervous System | 1 711 497 | £459,156,957 |
05 | Infections | 398 647 | £31,145,176 |
11 | Eye | 249 645 | £33,656,583 |
02 | Cardiovascular System | 239 765 | £19,488,521 |
13 | Skin | 216 778 | £79,805,701 |
01 | Gastro-Intestinal System | 197 437 | £26,380,974 |
06 | Endocrine System | 179 162 | £19,560,811 |
21 | Appliances | 130 264 | £37,931,817 |
09 | Nutrition and Blood | 123 207 | £13,121,650 |
03 | Respiratory System | 119 621 | £12,006,112 |
10 | Musculoskeletal and Joint Diseases | 116 545 | £8,085,556 |
12 | Ear, Nose and Oropharynx | 84 935 | £8,294,682 |
08 | Malignant Disease and Immunosuppression | 59 172 | £31,492,260 |
07 | Obstetrics, Gynaecology and Urinary-Tract Disorders | 51 876 | £3,656,976 |
19 | Other Drugs and Preparations | 18 767 | £968,891 |
20 | Dressings | 8 377 | £389,353 |
15 | Anaesthesia | 7 303 | £1,129,120 |
23 | Stoma Appliances | 1 752 | £63,026 |
22 | Incontinence Appliances | 668 | £63,078 |
14 | Immunological Products and Vaccines | 12 | £578 |
18 | Preparations used in Diagnosis | 7 | £426 |
When we are looking at the total number prescribed it is methadone oral solution that is most frequent but less expensive compared to physeptone that is prescribed the second (but is much more expensive).
<- 7
n
<- hosp_pr_2022_post %>%
sorted_3 group_by(BNF.CHAPTER.PLUS.CODE, BNF.NAME) %>%
summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
arrange(desc(N.ITEMS)) %>%
ungroup() %>%
slice(1:n)
#data transformation and table presentation
<- hosp_pr_2022_post %>%
top_number_by_chapt_tbl select(BNF.CHAPTER.PLUS.CODE, BNF.NAME, TOTAL.ITEMS, TOTAL.ITEMS.COST) %>%
mutate(BNF.NAME = as_factor(BNF.NAME) %>%
fct_lump(n = n, w = TOTAL.ITEMS)) %>%
group_by(BNF.NAME) %>%
summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
ungroup() %>%
#I want to arrange it in the right order for a plot
mutate(BNF.NAME = BNF.NAME %>% fct_reorder(N.ITEMS)) %>%
mutate(BNF.NAME = BNF.NAME %>% fct_relevel("Other", after = 0)) %>%
arrange(desc(BNF.NAME)) %>%
#I want to add label to my plot
mutate(N.ITEMS_text = scales::number(N.ITEMS),
COST_text = scales::dollar(COST, scale = 1e-6, prefix = "£", suffix = "M")) %>%
#Cumulative percent
mutate(cum_pct = cumsum(N.ITEMS)/sum(N.ITEMS)) %>%
mutate(cum_pct_text = scales::percent(cum_pct)) %>%
#Rank
mutate(rank = row_number()) %>%
mutate(rank = case_when(
== max(rank) ~ NA_integer_,
rank TRUE ~ rank)) %>%
#Label text
mutate(label_text = str_glue("Rank: {rank}\nItems: {N.ITEMS_text}\nCost: {COST_text}"))
#join the column from another table
<- sorted_3 %>%
top_number_by_chapt_u_tbl select(BNF.CHAPTER.PLUS.CODE, N.ITEMS) %>%
full_join(top_number_by_chapt_tbl)
#I am not sure that vis is better here I prefer to create a table!
%>%
top_number_by_chapt_u_tbl select(BNF.NAME, BNF.CHAPTER.PLUS.CODE, N.ITEMS_text, COST_text) %>%
#flaxtable
rename("BNF Name" = BNF.NAME,
"BNF Chapter" = BNF.CHAPTER.PLUS.CODE,
"Number of Items" = N.ITEMS_text,
"Total Cost" = COST_text) %>%
#add a table
flextable() %>%
autofit() %>%
bold(part = "header") %>%
add_footer_lines("") %>%
add_header_lines(str_glue("Table 4. Top {n} most prescribed medicine sorted by number of items"))
Table 4. Top 7 most prescribed medicine sorted by number of items | |||
---|---|---|---|
BNF Name | BNF Chapter | Number of Items | Total Cost |
Methadone 1mg/ml oral solution sugar free | 04: Central Nervous System | 97 831 | £29.63M |
Physeptone 1mg/ml oral solution sugar free | 04: Central Nervous System | 76 060 | £59.91M |
Prednisolone 5mg tablets | 06: Endocrine System | 64 170 | £1.41M |
Co-amoxiclav 500mg/125mg tablets | 05: Infections | 55 125 | £6.57M |
Sertraline 50mg tablets | 04: Central Nervous System | 52 230 | £1.83M |
Methadone 1mg/ml oral solution | 04: Central Nervous System | 43 702 | £14.11M |
Folic acid 5mg tablets | 09: Nutrition and Blood | 42 106 | £1.02M |
Other | 3 484 213 | £671.93M | |
It should be emphasized that It is not the way to compare trusts between them as each of the hospital is specialise in different medical areas and can be bigger or smaller depending on the region and population.
%>% group_by(HOSPITAL.TRUST) %>%
hosp_pr_2022_post summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
arrange(desc(COST)) %>%
ungroup() %>%
top_n(10) %>%
#this mutate() transform it to character, so no possible to plot but better visibility - if I want to build a table
mutate(N.ITEMS = scales::number(N.ITEMS),
COST = scales::dollar(COST, prefix = "£")) %>%
rename("NHS Trust Name" = HOSPITAL.TRUST,
"Number of Items" = N.ITEMS,
"Total Cost" = COST) %>%
#add a table
flextable() %>%
autofit() %>%
bold(part = "header") %>%
add_footer_lines("") %>%
add_header_lines(str_glue("Table 5. Top 10 NHS Trusts that prescribed the most considering cost"))
Table 5. Top 10 NHS Trusts that prescribed the most considering cost | ||
---|---|---|
NHS Trust Name | Number of Items | Total Cost |
MIDLANDS PARTNERSHIP NHS FOUNDATION TRUST | 122 509 | £108,654,040 |
SANDWELL AND WEST BIRMINGHAM HOSPITALS NHS TRUST | 113 434 | £45,699,559 |
ALDER HEY CHILDREN'S NHS FOUNDATION TRUST | 16 654 | £36,677,264 |
NORTH BRISTOL NHS TRUST | 34 003 | £34,966,829 |
CENTRA AND NORTH WEST LONDON NHS FOUNDATION TRUST | 64 742 | £27,624,692 |
SUSSEX PARTNERSHIP NHS FOUNDATION TRUST | 60 553 | £27,098,929 |
ESSEX PARTNERSHIP UNIVERSITY NHS FOUNDATION TRUST | 49 704 | £24,076,846 |
NORTH EAST LONDON NHS FOUNDATION TRUST | 38 709 | £22,308,378 |
EAST KENT HOSPITALS UNIVERSITY NHS FOUNDATION TRUST | 106 436 | £22,167,541 |
TEES, ESK AND WEAR VALLEYS NHS FOUNDATION TRUST | 102 025 | £19,982,558 |
It should be emphasized that It is not the way to compare trusts between them as each of the hospital is specialise in different medical areas and can be bigger or smaller depending on the region and population.
%>% group_by(HOSPITAL.TRUST) %>%
hosp_pr_2022_post summarise(N.ITEMS = sum(TOTAL.ITEMS),
COST = sum(TOTAL.ITEMS.COST)) %>%
arrange(desc(N.ITEMS)) %>%
ungroup() %>%
top_n(10) %>%
#this mutate() transform it to character, so no possible to plot but better visibility - if I want to build a table
mutate(N.ITEMS = scales::number(N.ITEMS, big.mark = ","),
COST = scales::dollar(COST, prefix = "£")) %>%
rename("NHS Trust Name" = HOSPITAL.TRUST,
"Number of Items" = N.ITEMS,
"Total Cost" = COST) %>%
#add a table
flextable() %>%
autofit() %>%
bold(part = "header") %>%
add_footer_lines("") %>%
add_header_lines(str_glue("Table 6. Top 10 NHS Trusts that prescribed the most considering number of items"))
Table 6. Top 10 NHS Trusts that prescribed the most considering number of items | ||
---|---|---|
NHS Trust Name | Number of Items | Total Cost |
MIDLANDS PARTNERSHIP NHS FOUNDATION TRUST | 122,509 | £108,654,040 |
SANDWELL AND WEST BIRMINGHAM HOSPITALS NHS TRUST | 113,434 | £45,699,559 |
EAST KENT HOSPITALS UNIVERSITY NHS FOUNDATION TRUST | 106,436 | £22,167,541 |
TEES, ESK AND WEAR VALLEYS NHS FOUNDATION TRUST | 102,025 | £19,982,558 |
CENTRA AND NORTH WEST LONDON NHS FOUNDATION TRUST | 64,742 | £27,624,692 |
SUSSEX PARTNERSHIP NHS FOUNDATION TRUST | 60,553 | £27,098,929 |
ESSEX PARTNERSHIP UNIVERSITY NHS FOUNDATION TRUST | 49,704 | £24,076,846 |
NORTH EAST LONDON NHS FOUNDATION TRUST | 38,709 | £22,308,378 |
NORTH BRISTOL NHS TRUST | 34,003 | £34,966,829 |
ALDER HEY CHILDREN'S NHS FOUNDATION TRUST | 16,654 | £36,677,264 |
The total expenses reach almost £800M and almost 4M items from January to October in 2022. It is interesting to observe the gradual increase month over month.
#create a data set grouped by month
<- hosp_pr_2022_post %>%
hosp_pr_2022_sum_month select(PERIOD, PERIOD.YEAR, PERIOD.MONTH, TOTAL.ITEMS, TOTAL.ITEMS.COST) %>%
#group by month so there are 10 groups now
group_by(PERIOD.MONTH) %>%
#summarise to get the sum of prescriptions by each month
summarise(monthly_items = sum(TOTAL.ITEMS),
monthly_cost = sum(TOTAL.ITEMS.COST)) %>%
ungroup()
#Now to do the calculations
%>%
hosp_pr_2022_sum_month #find the cumulative amount items prescribed and cumulative cost for each month
mutate(cumulative_items = cumsum(monthly_items),
cumulative_cost = cumsum(monthly_cost)) %>%
#transform it to percentage but they are exactly the same, so delete one
mutate(cumulative_items_pct = cumulative_items/sum(monthly_items),
cumulative_cost_pct = cumulative_cost/sum(monthly_cost)) %>%
#but it is a character
mutate(cumulative_items_pct_chr = scales::percent(cumulative_items_pct),
cumulative_cost_pct_chr = scales::percent(cumulative_cost_pct),
cumulative_items_chr = scales::number(cumulative_items)) %>%
#Label text
mutate(label_text = str_glue("N Items: {cumulative_items_chr}\nCumPct items: {cumulative_items_pct_chr}\nCumPct Cost: {cumulative_cost_pct_chr}")) %>%
#plot
ggplot(aes(PERIOD.MONTH, cumulative_cost, group = 1)) +
geom_point(size = 5, colour = "#2C3E50") +
geom_line(size = 1.5, colour = "#2C3E50") +
#labels
geom_label(aes(label = label_text),
hjust = "inward",
size = 2,
colour = "#2C3E50") +
scale_y_continuous(labels = scales::dollar_format(scale = 1e-6, prefix = "£", suffix = "M")) +
labs(title = "Fig. 2. Cumulative cost of the prescriptions by month in 2022",
x = "",
y = "Cumulative Cost"
+
)
theme_tq()
The three-month moving average is around £80M.
#I want to plot moving averages
%>%
hosp_pr_2022_sum_month mutate(roll_mean_3 = rollmean(monthly_cost, k = 3, na.pad = TRUE, align = "right")) %>%
#plot
ggplot(aes(x = PERIOD.MONTH, y = monthly_cost, group = 1))+ # start ggplot
geom_line( # plot raw values
size = 1,
alpha = 0.2 # semi-transparent line
+
)
#adjust the limits and label on the scale
scale_y_continuous(labels=scales::dollar_format(prefix = "£",
scale = 1e-6, suffix = "M"),
limits = c(40000000, 100000000)) +
#add moving averages using tidyquant package
geom_ma( # plot moving average
n = 3,
size = 1,
color = "blue")+
labs(title = "Fig. 3. Time series analysis of total cost of prescriptions by month in 2022",
subtitle = "Three months moving averages(blue line)",
y = "Total Items Cost", x = "") +
theme_tq()