London Underground Network Study

Rouying Tang
4/29/2020

Introduction:
London is facing a major challenge from the COVID-19 pandemic. According to data provided by the Greater London Authority (GLA), by 24 April the total number of COVID-19 deaths reported in London hospitals was 4,606, compared with 18,420 deaths in U.K. hospitals nationwide (GLA). Roughly a quarter of these hospital deaths (4,606 / 18,420 ≈ 25%) occurred in London.
COVID-19 is an infectious disease. The World Health Organization states that “according to current evidence, COVID-19 virus is primarily transmitted between people through respiratory droplets and contact routes” (World Health Organization 2020). Public transport locations are therefore potentially dangerous, with a higher likelihood of transmitting and contracting the virus. The London Tube, well known as the oldest underground railway in the world, has served Londoners for more than 150 years, but the gathering of people at stations is risky during a pandemic. Not all stations carry the same risk, however, because risk depends on throughput and location.

Research Question:
The aim of this project is to identify the London Underground stations that potentially face the greatest risk.

Rationale:
By examining the centralities of the London Underground network, the stations that are more likely than others to transmit the disease can be identified.
In this project, I consider three measures, degree centrality, closeness centrality, and betweenness centrality, to detect the most important, or most central, stations in the London Underground network. Stations that score highly on these measures are taken to have a higher risk of transmitting the virus.
Degree centrality treats a station with a higher degree as more central. The degree of a station is the number of stations directly connected to it. When more stations are connected to a single station, it is likely that more passengers pass through that station to change trains.
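As an illustration of the degree count described above, here is a minimal sketch in plain Python. The five-station network and the function name degrees are hypothetical and are not taken from the London dataset.

```python
# Toy undirected network; the station names are illustrative only.
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")]

def degrees(edge_list):
    """Return {station: number of directly connected stations}."""
    neighbours = {}
    for u, v in edge_list:
        # An undirected edge makes each endpoint a neighbour of the other.
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return {station: len(adj) for station, adj in neighbours.items()}

degree = degrees(edges)
# Rank stations from the most to the least connected.
ranked = sorted(degree.items(), key=lambda item: item[1], reverse=True)
print(ranked[0])  # ('B', 3)
```

Station B is connected to A, C, and D, so it ranks first with a degree of 3.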
Closeness centrality can be defined as follows: the shorter a station's total distance to all other stations, the more central it is. It reflects how long it takes a passenger to travel from this station to each of the other stations.
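The idea can be sketched with a breadth-first search that counts hops between stations. This sketch assumes a connected, unweighted network and uses the common normalization (number of other stations divided by total distance), which the text above does not specify; the toy network and function names are hypothetical.

```python
from collections import deque

# Toy undirected network as an adjacency list; names are illustrative only.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}

def bfs_distances(graph, source):
    """Return the hop count from source to every reachable station."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def closeness(graph, source):
    """(number of other reachable stations) / (total distance to them)."""
    dist = bfs_distances(graph, source)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

print(closeness(graph, "B"))  # 0.8
print(closeness(graph, "A"))  # 0.5
```

Station B reaches the other four stations in a total of 5 hops, while station A needs 8, so B is the more central of the two.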
Betweenness centrality measures how often a station lies on the shortest path between two other stations that are not directly connected. It reflects how necessary it is for a passenger to pass through a station to reach a destination.
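One way to compute this measure is Brandes' algorithm, sketched below for an unweighted, undirected network. The report does not state which algorithm or tool produced its figures, so this is only an illustration; the toy network and the function name betweenness are hypothetical.

```python
from collections import deque

# Toy undirected network as an adjacency list; names are illustrative only.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}

def betweenness(graph):
    """Unnormalized betweenness: shortest paths through each station."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []                       # stations in order of discovery
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest stations, accumulating dependencies.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both of its endpoints.
    return {v: c / 2 for v, c in score.items()}

result = betweenness(graph)
print(result["B"], result["D"])  # 5.0 3.0
```

In the toy network, five station pairs (A-C, A-D, A-E, C-D, C-E) can only reach each other through B, which gives B a score of 5.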

Data and Methods:
Data sources:
The “london-underground.xml” file was released by DomWeldon on Feb 20, 2016. It describes the London Underground network, built from Wikipedia data, as an undirected graph with 306 nodes and 353 edges. The file was downloaded on 2020/4/25 from the DomWeldon/LondonUndergroundNetwork repository: https://github.com/DomWeldon/LondonUndergroundNetwork.
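A file of this kind might be read with Python's standard library as sketched below. The sketch assumes a GraphML-style layout with node and edge elements, which the source does not confirm; the inline SAMPLE text and the names local_name and read_graph are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real file; the actual schema is an assumption.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="n0"/>
    <node id="n1"/>
    <node id="n2"/>
    <edge source="n0" target="n1"/>
    <edge source="n1" target="n2"/>
  </graph>
</graphml>"""

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def read_graph(xml_text):
    """Return (node ids, edge pairs) found anywhere in the XML text."""
    root = ET.fromstring(xml_text)
    nodes = [el.get("id") for el in root.iter() if local_name(el.tag) == "node"]
    edges = [
        (el.get("source"), el.get("target"))
        for el in root.iter()
        if local_name(el.tag) == "edge"
    ]
    return nodes, edges

nodes, edges = read_graph(SAMPLE)
print(len(nodes), len(edges))  # 3 2
```

Applied to the real file, the two counts could be compared with the 306 nodes and 353 edges reported above.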

Results:
The Network “All stations ranked by Degree Centrality”:


“The stations with a Degree Centrality of 6 or 7”:


The Network “Closeness Centrality for all stations”:


“The stations with top 12 closeness centrality”:


The Network “All stations ranked by Betweenness Centrality”:


“The stations with top 10 betweenness centrality”:


The average degree is 2.307. The network diameter is 38.
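The average degree can be checked from the node and edge counts given in the Data sources section, since each edge contributes to the degree of two stations. The sketch below also shows how a diameter (the longest shortest path) can be found by breadth-first search on a small, connected toy network; the toy network and function names are hypothetical.

```python
from collections import deque

# Counts reported in the Data sources section.
NODES, EDGES = 306, 353
average_degree = 2 * EDGES / NODES  # each edge adds 1 to the degree of 2 stations
print(round(average_degree, 3))  # 2.307

# Toy undirected network for the diameter; names are illustrative only.
toy = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}

def eccentricity(graph, source):
    """Hop count from source to the station farthest away from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return max(dist.values())

def diameter(graph):
    """Largest eccentricity; assumes the network is connected."""
    return max(eccentricity(graph, s) for s in graph)

print(diameter(toy))  # 3
```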
The stations connected to 7 other stations are “King’s Cross St. Pancras” and “Baker Street”.
The stations connected to more than 5 other stations are “Canning Town”, “Earl’s Court”, “Green Park”, “Liverpool Street”, “Oxford Circus”, “Paddington”, “Shadwell”, “Turnham Green”, “Waterloo”, “King’s Cross St. Pancras”, and “Baker Street”.
The top 12 stations with the largest closeness centrality are “Hyde Park Corner”, “Bond Street”, “Westminster”, “Piccadilly Circus”, “Oxford Circus”, “Embankment”, “Green Park”, “St. James’s Park”, “Victoria”, “Baker Street”, “Bank”, and “Waterloo”.
The top 10 stations with the largest betweenness centrality are “Baker Street”, “Bethnal Green”, “Bond Street”, “Westminster”, “Mile End”, “Stratford”, “Green Park”, “Baker Street”, “Liverpool Street”, and “Waterloo”.

Conclusion:
On average, each station is connected to 2.307 other stations.
“Waterloo”, “Green Park”, and “Baker Street” appear under all three factors. These stations are connected to more than 5 stations and have high closeness centrality and betweenness centrality, which indicates that they potentially face the highest risk. Passengers should try their best to avoid these stations, and the authorities should implement the strictest measures there.
“Liverpool Street”, “Oxford Circus”, “Bond Street”, and “Westminster” appear under two factors. They are also risky, but less so than the previous three stations.
“Bethnal Green”, “Mile End”, “Stratford”, “Bank”, “Victoria”, “St. James’s Park”, “Embankment”, “Piccadilly Circus”, “Hyde Park Corner”, “Canning Town”, “Earl’s Court”, “Paddington”, “Shadwell”, “Turnham Green”, and “King’s Cross St. Pancras” appear under one factor. They are less risky than the stations in the two groups above.
All other stations are relatively less risky, but this does not mean that they are risk-free or safe for traveling.
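The grouping by number of factors can be tallied mechanically from the three station lists in the Results section. The sketch below is one way to do so; the variable names are hypothetical. The betweenness list names “Baker Street” twice, so its set holds nine distinct stations.

```python
from collections import Counter

# Stations connected to more than 5 stations (Results section).
degree_list = {
    "Canning Town", "Earl's Court", "Green Park", "Liverpool Street",
    "Oxford Circus", "Paddington", "Shadwell", "Turnham Green",
    "Waterloo", "King's Cross St. Pancras", "Baker Street",
}
# Top 12 stations by closeness centrality (Results section).
closeness_list = {
    "Hyde Park Corner", "Bond Street", "Westminster", "Piccadilly Circus",
    "Oxford Circus", "Embankment", "Green Park", "St. James's Park",
    "Victoria", "Baker Street", "Bank", "Waterloo",
}
# Top stations by betweenness centrality (Results section);
# "Baker Street" is listed twice there, so only nine names are distinct.
betweenness_list = {
    "Baker Street", "Bethnal Green", "Bond Street", "Westminster",
    "Mile End", "Stratford", "Green Park", "Liverpool Street", "Waterloo",
}

# Count in how many of the three lists each station appears.
factor_count = Counter()
for station_list in (degree_list, closeness_list, betweenness_list):
    factor_count.update(station_list)

# Group the station names by that count.
groups = {}
for station, count in factor_count.items():
    groups.setdefault(count, []).append(station)

for count in sorted(groups, reverse=True):
    print(count, sorted(groups[count]))
```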

Discussion and Limitation:
This project is meant to rank the stations by risk so that resources and staff can be allocated more effectively. Three factors, degree centrality, closeness centrality, and betweenness centrality, are used to indicate how risky a station is.
The reliability of the data source is not assured, because the dataset comes from a personal repository and was built from Wikipedia. The data was last updated on Feb 20, 2016, so it may be inconsistent with the present state of the network. However, it was the only dataset available for this project. Because of this limitation, the conclusion is only a rough guide and is not intended to solve the real problem in London.
Degree centrality counts only the immediate connections of a station. A station may have many connections to other stations and be central within its local neighborhood, yet at the same time be remote from the network as a whole.

Citation Sources:
Greater London Authority (GLA). 2020. Coronavirus (COVID-19) Deaths. April 21. Accessed April 26, 2020. https://data.london.gov.uk/dataset/coronavirus--covid-19--deaths.
World Health Organization. 2020. Modes of transmission of the virus causing COVID-19: implications for IPC precaution recommendations: scientific brief, 29 March 2020. https://apps.who.int/iris/handle/10665/331616. License: CC BY-NC-SA 3.0 IGO.