
Companies House Charts (Day 4 of 5)

My exploratory data analysis of Companies House data in August 2020:

Chart: length of company name, by year of incorporation



Next, the limitations:

  • Only each company's latest name is plotted; previous names are not included


And the code snippet:

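library(dplyr)      # %>%, mutate, select, filter, pull
library(stringr)    # str_length, str_glue
library(lubridate)  # as_date, ymd

# Get labels for the bubble of interest
data <- data %>%
  mutate(
    CompanyName_length = str_length(CompanyName),
    IncorporationDate_jul = as_date(IncorporationDate_ymd)
  ) %>%
  mutate(
    label_text = str_glue("{CompanyName}")
  ) %>%
  select(-CompanyName, -IncorporationDate_ymd)

# Pick 5 random companies with names over 60 characters,
# incorporated between 2000-01-01 and 2010-01-01 (exclusive).
# Note: sample(5) errors if fewer than 5 companies match.
bubble1 <- data %>%
  filter(
    CompanyName_length > 60,
    IncorporationDate_jul > ymd("2000-01-01"),
    IncorporationDate_jul < ymd("2010-01-01")
  ) %>%
  pull(CompanyNumber) %>%
  sample(5)

# Blank the label for every company outside the sample
# (exploratory method, could be faster :)
data[!data$CompanyNumber %in% bubble1, "label_text"] <- ""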
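The final step blanks the labels with base-R indexing. One alternative that stays inside the dplyr pipe is sketched below. This is only an illustration, not the method used for the chart: the `toy` data frame, its values and the `keep` vector are hypothetical stand-ins for the real Companies House extract and `bubble1`.

```r
library(dplyr)

# Hypothetical toy data standing in for the Companies House extract
toy <- tibble(
  CompanyNumber = c("001", "002", "003", "004"),
  label_text    = c("ALPHA LTD", "BETA LTD", "GAMMA LTD", "DELTA LTD")
)

# Hypothetical selection of company numbers whose labels should stay visible
keep <- c("002", "004")

# Keep the label where the company number is in the selection, blank it otherwise
toy <- toy %>%
  mutate(label_text = if_else(CompanyNumber %in% keep, label_text, ""))

print(toy)
```

Here `label_text` is a plain character column. If it were created with `str_glue()` as in the snippet above, converting it with `as.character()` before calling `if_else()` may be the safer choice, since `if_else()` is strict about both branches having compatible types.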
