Companies House Charts (Day 5 of 5)
My exploratory data analysis of Companies House data in August 2020:
Company status by month
Next, the limitations:
The delay in processing company status updates is likely to vary across cities
Status categories aren't clean, especially when a receiver is instructed
And the code snippet:
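# packages: tidyverse (dplyr, forcats, ggplot2), lubridate for ymd(), tidyquant for theme_tq()
library(tidyverse)
library(lubridate)
library(tidyquant)

# group and count each status for each month
temp %>%
  filter(CompanyStatus != "Active") %>%
  group_by(path_ym, CompanyStatus) %>%
  summarise(path_CompanyStatus_count = n(), .groups = "drop") %>%
  mutate(
    # order statuses from largest to smallest (fct_reorder uses the median monthly count)
    CompanyStatus = CompanyStatus %>% as_factor() %>% fct_reorder(path_CompanyStatus_count) %>% fct_rev(),
    CompanyStatus_num = CompanyStatus %>% as.numeric()
  ) %>%
  # keep the six largest statuses
  filter(CompanyStatus_num <= 6) %>%
  select(-CompanyStatus_num) %>%
  # parse the month string into a date for the x-axis
  mutate(path_ym = ymd(path_ym)) %>%
  # graph: one line per status, each panel with its own y-axis
  ggplot(aes(x = path_ym, y = path_CompanyStatus_count)) +
  geom_line() +
  facet_wrap(~CompanyStatus, scales = "free_y") +
  # formatting
  labs(title = "", x = "", y = "") +
  theme_tq() +
  theme(legend.position = "none")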
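Since the status categories aren't clean, it can help to tally the raw labels before charting. The sketch below uses a made-up stand-in for `temp` (the column names match the snippet above, but the rows are invented for illustration, not taken from the Companies House file) and counts every status label, most frequent first.

```r
library(dplyr)

# Hypothetical stand-in for `temp`: one row per company per monthly snapshot.
# The values are invented for illustration only.
temp <- data.frame(
  path_ym = rep(c("2020-06-01", "2020-07-01", "2020-08-01"), each = 4),
  CompanyStatus = rep(
    c("Active", "Liquidation", "Active - Proposal to Strike off", "Liquidation"),
    times = 3
  ),
  stringsAsFactors = FALSE
)

# Tally every status label, most frequent first
status_counts <- temp %>%
  count(CompanyStatus, sort = TRUE)

print(status_counts)
```

Note that a label such as "Active - Proposal to Strike off" is not equal to "Active", so the `CompanyStatus != "Active"` filter in the main snippet keeps it as its own category.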