Thursday, August 20, 2015

InterTubes: A Study of the US Long-haul Fiber-optic Infrastructure

Ramakrishnan Durairajan (University of Wisconsin - Madison), Paul
Barford (University of Wisconsin - Madison and comScore, Inc.), Joel
Sommers (Colgate University), Walter Willinger (NIKSUN, Inc.)

Paper, public review, and datasets through the PREDICT program.

Ram discussed how the Internet combines many technologies, yet we seldom study its physical infrastructure.  This work is about what the US long-haul fiber infrastructure looks like, how resilient it is, how it impacts risks, and how we can improve it.  Ram mentioned that no one has a complete view of the Internet at the physical level, and quoted Ted Stevens's "The Internet is just a series of tubes."

Ram explained the process used to build the map, which anyone can follow given effort and time: (i) build an initial map from geocoded ISP topology maps (11 ISPs), (ii) validate it with other sources, (iii) extend the initial map with non-geocoded ISP topology maps (9 ISPs), and (iv) infer shared conduits (using sources similar to those used in (ii)).  Maps are built using ArcGIS, a tool used by geographers.  This is the final result: [figure: map of the US long-haul fiber-optic infrastructure]

Ram showed a few examples of data mining and consistency checks, and discussed a few properties of the map (e.g., that it follows road and railway networks).  Ram then discussed the risk induced by infrastructure sharing: critical choke points shared by many (17+) ISPs exist, and physical connectivity lacks the diversity observed at higher layers.  Ram also showed that the majority of ISPs accept decreased resilience in exchange for lower costs (i.e., they share infrastructure).
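
To make the notion of a choke point concrete, here is a minimal sketch that counts how many ISPs use each conduit and lists the most heavily shared ones.  The ISP names, city pairs, and the `choke_points` helper are hypothetical and only for illustration; they are not data or code from the paper.

```python
from collections import defaultdict

# Hypothetical toy data: (ISP, conduit) pairs, where a conduit is named
# by its two endpoint cities. Illustrative only, not taken from the paper.
isp_conduits = [
    ("ISP-A", ("Denver", "Salt Lake City")),
    ("ISP-B", ("Denver", "Salt Lake City")),
    ("ISP-C", ("Salt Lake City", "Denver")),
    ("ISP-A", ("Denver", "Kansas City")),
    ("ISP-B", ("Chicago", "Kansas City")),
]


def tenants_per_conduit(pairs):
    """Map each conduit (treated as undirected) to the set of ISPs using it."""
    tenants = defaultdict(set)
    for isp, (a, b) in pairs:
        tenants[frozenset((a, b))].add(isp)
    return dict(tenants)


def choke_points(pairs, min_isps):
    """Return conduits shared by at least min_isps ISPs, most shared first."""
    tenants = tenants_per_conduit(pairs)
    shared = [
        (sorted(conduit), sorted(isps))
        for conduit, isps in tenants.items()
        if len(isps) >= min_isps
    ]
    return sorted(shared, key=lambda item: (-len(item[1]), item[0]))


print(choke_points(isp_conduits, min_isps=2))
```

In this toy input only the Denver to Salt Lake City conduit is used by more than one ISP, so it is the only one reported.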

Ram also discussed approaches to improve physical connectivity.  For example, ISPs can reduce infrastructure sharing without increasing path length significantly.  Finally, Ram mentioned the implications of this work on policy-making.
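
One way to picture the trade-off between sharing and path length is a shortest-path search in which each conduit costs its length plus a penalty for every ISP already using it.  The sketch below is an assumption-laden illustration, not the optimization used in the paper: the topology, the `sharing_penalty_km` parameter, and the `best_path` function are all hypothetical.

```python
import heapq

# Hypothetical conduits: (endpoint, endpoint, length_km, number_of_isps_sharing).
CONDUITS = [
    ("A", "B", 100, 10),
    ("B", "D", 100, 10),
    ("A", "C", 110, 2),
    ("C", "D", 110, 2),
]


def best_path(conduits, src, dst, sharing_penalty_km=0.0):
    """Dijkstra over conduits; each hop costs its length plus a penalty
    (in km-equivalents) for every ISP sharing that conduit.

    Returns (path, true_length_km), or (None, None) if dst is unreachable.
    """
    graph = {}
    for a, b, length, isps in conduits:
        graph.setdefault(a, []).append((b, length, isps))
        graph.setdefault(b, []).append((a, length, isps))

    # Heap entries: (penalized cost, node, path so far, true length in km).
    heap = [(0.0, src, [src], 0)]
    done = set()
    while heap:
        cost, node, path, km = heapq.heappop(heap)
        if node == dst:
            return path, km
        if node in done:
            continue
        done.add(node)
        for nxt, length, isps in graph.get(node, []):
            if nxt not in done:
                hop_cost = length + sharing_penalty_km * isps
                heapq.heappush(heap, (cost + hop_cost, nxt, path + [nxt], km + length))
    return None, None


print(best_path(CONDUITS, "A", "D"))                        # shortest path
print(best_path(CONDUITS, "A", "D", sharing_penalty_km=5))  # avoids heavy sharing
```

With no penalty the search picks the 200 km route through the heavily shared conduits; with a penalty of 5 km per co-tenant it picks the 220 km route through conduits shared by only two ISPs, a 10% longer path in this toy case.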


David Clark: You mentioned 18 ISPs were sharing a conduit, but that does not tell us how much harm a failure there would cause.

A: It depends on the intradomain topology.  What we can do is associate a node with close-by population centers, and we can relate the impact of a cut to the population.
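
The idea in this answer can be sketched as follows: attach each conduit endpoint to population centers within some radius, and report the total nearby population as a rough proxy for the impact of a cut.  The node coordinates, town names, populations, and the 50 km radius are all made up for illustration; the talk did not specify how the association is done.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical conduit endpoints: name -> (lat, lon).
NODES = {"n1": (40.0, -105.0), "n2": (41.0, -112.0)}

# Hypothetical population centers: (name, lat, lon, population).
POP_CENTERS = [
    ("Town X", 40.1, -105.1, 500_000),
    ("Town Y", 41.0, -111.9, 200_000),
    ("Town Z", 35.0, -90.0, 900_000),
]


def cut_impact(conduit, radius_km=50.0):
    """Total population of centers within radius_km of either endpoint.

    Each center is counted once even if it is close to both endpoints.
    """
    affected = {}
    for node in conduit:
        lat, lon = NODES[node]
        for name, plat, plon, pop in POP_CENTERS:
            if haversine_km(lat, lon, plat, plon) <= radius_km:
                affected[name] = pop
    return sum(affected.values())


print(cut_impact(("n1", "n2")))
```

This only bounds who sits near the cut; as the answer notes, the real impact also depends on each ISP's intradomain topology and available backup paths.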

David: Microsoft and Google have networks that look larger than Level 3's.  Have you tried looking at their networks?

A: We have Level 3, but not the others.

Q: Why are ISPs and companies hiding this information that we can figure out easily?  What do you think about this opacity?  Your results are particular to the US.  What would happen if you looked at a different country?  Any comments on what is specific to the US?

A: Ram showed fiber maps for Africa and Estonia, mentioned that they shared similarities, and said the same study could be done for other countries.  He guessed the opacity might be related to public image.

Keith Winstein: When you validated your dataset, what percentage of the links were you able to validate?

A: More than 95%.  There were maps from several locations that we could not validate.

Q: What about connectivity for research networks (ESnet and Internet2)?  These networks might be more amenable to sharing information with you.  Are there strategies or differences for these networks?

A: We did not include this data at this time.

Nick Feamster: I guess you are aware of Sean Gorman's work.  Was there a change of direction in the trend to classify this information?  Did you see any difference between his analysis and this?

A: We are aware of Sean's work.  All of the data we use is publicly available.  We do not point to specific locations.

Nick: Do you have any information about FTTH and FTTN deployments?  It might be useful to join with this information.

A: Yes, there is information.

Anja Feldmann: You said that getting the information was easy?  Was it really?

A: (Laughs.) Lots of searching and lots of coffee.

Anja: What would be a good dataset for estimating traffic?

A: Previous research; we used the number of traceroutes going through conduits.
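
As a rough illustration of that traffic proxy, the sketch below counts how many traceroutes cross each conduit.  It assumes the hard parts are already done: each traceroute has been geolocated into a sequence of cities, and a hop between two cities maps to a conduit only if that city pair is a known conduit.  The data and the `conduit_usage` function are hypothetical; the paper's actual mapping may differ.

```python
from collections import Counter

# Hypothetical known conduits, each an unordered pair of endpoint cities.
CONDUITS = {
    frozenset(("Chicago", "Denver")),
    frozenset(("Denver", "Salt Lake City")),
}

# Hypothetical traceroutes, already geolocated to a sequence of cities.
TRACEROUTES = [
    ["Chicago", "Denver", "Salt Lake City"],
    ["Denver", "Salt Lake City"],
    ["Chicago", "New York"],
]


def conduit_usage(traceroutes, conduits):
    """Count, for each conduit, the number of traceroutes that traverse it."""
    usage = Counter()
    for hops in traceroutes:
        crossed = set()
        for a, b in zip(hops, hops[1:]):
            pair = frozenset((a, b))
            if pair in conduits:
                crossed.add(pair)
        # Count each traceroute at most once per conduit.
        usage.update(crossed)
    return usage


for conduit, count in conduit_usage(TRACEROUTES, CONDUITS).most_common():
    print(sorted(conduit), count)
```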