‘The Concert Project’: Going Places

Let’s talk about location. The live music industry revolves around touring, and location is a very important factor in planning a tour. We’d expect population centers to host a lot of events, since there are more potential concertgoers nearby. There might, however, be smaller markets where touring is also popular, either because they make a good stopover between larger cities or because they have a high demand for live music. These ideas will be explored in this and the next blog post.

But first, let’s take a peek at the artists who are doing the touring.



[Embedded notebook: Blog_3_1.ipynb]

Apologies for the horrible table. It looked better outside of GitHub (I swear)! I can see, however, that our top 15 touring acts each have between 73 and 177 events in the data set. The names look familiar, too. Many of them are male contemporary country artists. The top act is “Florida Georgia Line”.

Though it looks bad, the biggest problem with this output is actually that it is not weighted by the number of shows. It merely counts the rows attributed to each artist, but many events comprise multiple shows in a single location – either a weekend or a short stay in a larger metro.
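To make the rows-versus-shows distinction concrete, here is a minimal sketch. The rows are made-up placeholders, not values from the data set, and the three-field layout (artist, city, shows) is only an assumption about how an event row might look.

```python
from collections import Counter, defaultdict

# Made-up placeholder rows: (artist, city, number of shows in that event row).
events = [
    ("Artist A", "Las Vegas", 20),
    ("Artist B", "Nashville", 1),
    ("Artist B", "Chicago", 2),
    ("Artist B", "Atlanta", 1),
]

# Unweighted: every row counts once, no matter how many shows it covers.
rows_per_artist = Counter(artist for artist, _, _ in events)

# Weighted: add up the shows column for each artist.
shows_per_artist = defaultdict(int)
for artist, _, shows in events:
    shows_per_artist[artist] += shows

# Rank artists both ways, highest first.
by_rows = sorted(rows_per_artist, key=rows_per_artist.get, reverse=True)
by_shows = sorted(shows_per_artist, key=shows_per_artist.get, reverse=True)

print("ranked by rows: ", by_rows)
print("ranked by shows:", by_shows)
```

In this toy example the artist with a single long residency ranks last by rows but first by shows, which is exactly the kind of reshuffling the unweighted table can hide.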

Below I take a peek at shows, revenue, and tickets sold.



[Embedded notebook: Blog_3_2.ipynb]

My markdown explores some of my stream-of-consciousness first impressions of the data. My initial reaction to the first table is that there is something funky going on with my assumption that the data is touring data. It seems apparent that these top four acts were not touring. They were likely in a single location and performing many shows under single ‘events’ – possibly lumping entire months together into a single event. This static location is likely Las Vegas. Interestingly, if I have captured every show that Donny & Marie performed in the four-year period of the study (which I likely haven’t, since under-reporting in the data is assumed), then it works out that they performed a show at least once every three days, on average. That is an incredible work ethic!

The next two tables look at the tickets sold and the total revenues earned in the time frame. The names on the ‘Tickets’ version of the table look much more like those in my first look at the data. Lots of country artists, a few jam bands, and pop-star type artists near the bottom. Amazingly, in four years Luke Bryan sold (at least) 3.1 million tickets. If he never sold more than one ticket to the same person, that could be a ticket for nearly 1 in every 100 people in the US!
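The “1 in 100” figure is easy to sanity-check. The ticket count comes from the table above. The US population of about 320 million is my own rough assumption for the period, not a number from the data set.

```python
tickets_sold = 3_100_000      # from the post: "(at least) 3.1 million"
us_population = 320_000_000   # ASSUMPTION: rough US population, not from the data

# Share of the population that could hold a ticket if nobody bought twice.
share = tickets_sold / us_population
people_per_ticket = us_population / tickets_sold

print(f"{share:.2%} of the population")
print(f"about 1 ticket for every {people_per_ticket:.0f} people")
```

Under that assumption the result lands at roughly one ticket per 103 people, so “nearly 1 in 100” holds up.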

However many tickets Luke Bryan sold, he only made 75% as much as Beyoncé – and she only sold 60% as many tickets! She and U2 head to the front of the pack when revenue is analyzed. So now that I can see who is on the ‘leader board’, I’m interested in where these artists are performing.
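As a quick aside on the Beyoncé comparison: the two percentages above also imply a gap in revenue per ticket. This sketch uses only those two ratios and assumes both are measured over the same period.

```python
revenue_ratio = 0.75  # Luke Bryan revenue / Beyoncé revenue (from the post)
ticket_ratio = 0.60   # Beyoncé tickets / Luke Bryan tickets (from the post)

# Revenue per ticket, Beyoncé relative to Luke Bryan:
#   (R_B / T_B) / (R_L / T_L) = (R_B / R_L) * (T_L / T_B)
per_ticket_multiple = (1 / revenue_ratio) * (1 / ticket_ratio)

print(f"Beyoncé earned about {per_ticket_multiple:.1f}x as much per ticket")
```

That comes out to a bit more than twice the revenue per ticket, which is why she jumps to the front once revenue is the measure.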



[Embedded notebook: Blog_3_3.ipynb]

Again, apologies about the state of this output. And I’ll reiterate one more time – there are likely reporting biases. Some shows may not be reported, and many of those shows may be from smaller venues and smaller promoters who probably have less at stake and value this kind of data reporting less. These biases may then correlate not only with the artists represented by these under-reporters but also with venues (and therefore cities).

Disclaimer and apology out of the way – LA is the winner. It is a HUGE metro with a penchant for the entertainment industry. The same caveat applies: this is the number of rows, not weighted by the number of shows. It does, however, give a good idea of where tours and single shows are likely taking place. The top five cities actually each represent a major geographic region – West Coast, Southwest, Southeast, Northeast, and Midwest.

Other cities start filling out the rest of the picture, some of which are iconic music towns like Nashville and Chicago. To really make a picture, though, I need to get these points on a map. Using the Google Maps API I was able to attach a latitude and longitude to each city in my list, and using openly available government data I was able to capture the population of the county in which each city sits, the per-capita personal income of that county, and the % of the state’s entertainment business happening in that county (named ‘Hip Score’ – I’ll go into that more later). From there I made a few simple maps.
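For a rough idea of how those pieces fit together, here is a minimal sketch of the join. Everything in it is hypothetical: the city and county names, the numbers, and the field names are placeholders, and the geocoding step is replaced by a hard-coded lookup rather than a real API call. The Hip Score formula (a county's share of its state's entertainment businesses, as a percentage) is my reading of the one-line description above, not necessarily the exact calculation used.

```python
from dataclasses import dataclass

@dataclass
class CityRecord:
    city: str
    county: str
    lat: float
    lon: float
    population: int
    income_per_capita: float
    hip_score: float  # county's % share of its state's entertainment businesses

# Hypothetical stand-in for geocoder results: city -> (latitude, longitude).
geocodes = {"City A": (36.0, -86.0), "City B": (41.0, -87.0)}

# Hypothetical city -> county mapping and county-level statistics.
city_to_county = {"City A": "County X", "City B": "County Y"}
county_stats = {
    "County X": {"state": "S1", "population": 700_000,
                 "income": 60_000.0, "ent_businesses": 300},
    "County Y": {"state": "S1", "population": 5_000_000,
                 "income": 65_000.0, "ent_businesses": 900},
}

def state_totals(stats):
    """Sum entertainment businesses per state across all counties."""
    totals = {}
    for row in stats.values():
        totals[row["state"]] = totals.get(row["state"], 0) + row["ent_businesses"]
    return totals

def build_records(cities):
    """Join each city to its coordinates and its county's statistics."""
    totals = state_totals(county_stats)
    records = []
    for city in cities:
        county = city_to_county[city]
        row = county_stats[county]
        lat, lon = geocodes[city]
        hip = 100 * row["ent_businesses"] / totals[row["state"]]
        records.append(CityRecord(city, county, lat, lon,
                                  row["population"], row["income"], hip))
    return records

records = build_records(["City A", "City B"])
for record in records:
    print(record)
```

Keeping the county statistics keyed by county rather than by city means two cities in the same county share one row, which avoids double-counting when the state totals are summed.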



[Embedded notebook: Blog_3_6.ipynb]

Using the latitude and longitude columns in the ‘venues’ data I am able to create the ‘Geocode’ variable, which is a 2-column table of geopoints. And with a little magic from ggplot – we have a population density map! Joking aside, it is cool to see where the concerts are taking place. We can see some sparse regions with concerts – maybe one-offs or possibly festivals in rural areas, but most of our concerts are happening where people are. Next time I’ll join our venue data with our event data and start to really take a peek under the hood of what is going on. Until then – thanks for reading!