AS-Level BGP Community Usage Classification

Welcome

This website provides access to our current research on BGP communities. If you are new to this website, we recommend taking a quick look at the introduction. For the full technical details, please read our IMC paper:

AS-Level BGP Community Usage Classification
Thomas Krenc, Robert Beverly and Georgios Smaragdakis, In ACM IMC'21

Please cite our work if you use data sets or scripts from this website. For further questions and feedback, please contact us at communityexploration AT cmand DOT org.

Site contents:

Taxonomy: AS-Level Community Usage
Download: Community Usage Data Sets
Reproduce: Script & Input Data

Introduction

BGP communities are a popular mechanism used by network operators for traffic engineering, blackholing, and to realize network policies and business strategies. We present a passive algorithm to infer the community usage on a per-AS level. At its core, our algorithm implements a system of constraints that groups ASes according to their community tagging and forwarding behavior:

Tagging behavior
- Tagger - A tagger AS adds its own communities in a consistent and automated fashion, e.g., informational communities, and always forwards them on external links.
- Silent - A silent AS may use its own communities internally, but it never forwards them on external links.
Forwarding behavior
- Forward - A forward AS does not remove communities added by other tagger ASes and always forwards them on external links.
- Cleaner - A cleaner AS removes communities set by other tagger ASes, either upon receiving or upon forwarding on external links.

AS-Level Community Usage

To illustrate our mental model of community usage at the AS-level, in the following, we use simplified topologies with three ASes (X, Y and Z) and a collector. AS X consists of three routers: X1 is connected to AS Y, X2 is connected to AS Z, and the third router is connected to the collector.

Tagging behavior
Figure 1 illustrates the tagging behavior of AS X:



Figure 1: Example tagging behavior of AS X: Tagger.

Neighboring ASes Y and Z announce their prefixes Y/24 and Z/23 towards routers X1 and X2, respectively. The individual prefixes are tagged with communities X:X1 and X:X2. AS X can use those communities to, e.g., signal the ingress location to every internal router. In our example, AS X forwards its own communities on an external link towards the collector, where we can actually observe the communities.

When an AS always forwards its own communities on external links, as in Figure 1, we call it a tagger.

Next, Figure 2 shows the opposite of tagger behavior, i.e., the AS never forwards its own communities on external links:



Figure 2: Example tagging behavior of AS X: Silent.

In the above example, neighboring ASes Y and Z again announce their prefixes Y/24 and Z/23 towards routers X1 and X2, respectively. AS X can behave in one of the following ways: (a) AS X defines and adds its own communities and removes them prior to forwarding the announcements, or (b) AS X does not use communities at all. In both cases, we do not observe any communities defined by AS X at the collector.

When an AS never forwards its own communities on external links, either because it removes them beforehand or because it never adds any, as in Figure 2, we call it a silent AS.

Forwarding behavior
Thus far, we have looked at the tagging behavior of ASes, i.e., whether they define and add their own communities and forward them on external links, or not. In the following examples, we look at the forwarding behavior of ASes. Specifically, we ask how ASes deal with communities other than their own.

Since our definitions of forwarding behavior require downstream taggers, in Figures 3 and 4 we assume neighboring ASes Y and Z to be tagger ASes, so they define, add and forward their communities on external links.



Figure 3: Example forwarding behavior of AS X: Forward.

In the above example, AS X receives routes that are tagged with communities defined by AS Y and Z. It forwards those communities internally, as well as on external links; here to the collector, where we can observe communities Y:Y1 and Z:Z1.

When an AS always forwards communities other than its own on external links, as in Figure 3, we call it a forward AS.

Finally, we look at a scenario where AS X never forwards communities other than its own on external links.



Figure 4: Example forwarding behavior of AS X: Cleaner.

In the above example, neighboring ASes Y and Z again tag their prefixes with communities Y:Y1 and Z:Z1, respectively. AS X can deal with those communities in the following ways: (a) AS X can remove those communities upon arrival, or (b) AS X can use those communities internally, but removes them prior to re-announcing the prefix. In both cases, we do not observe any communities defined by AS Y and Z at the collector.

When an AS never forwards other communities on external links, as in Figure 4, we call it a cleaner AS.
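
The four behaviors can be mimicked with a small toy model. The Python sketch below only illustrates the mental model of Figures 1 to 4 (what the collector behind AS X would see); it is not the inference algorithm from the paper, and the names Behavior and observed_at_collector are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    """Hypothetical description of AS X in the toy model."""
    tagger: bool   # True: own communities are forwarded externally; False: silent
    forward: bool  # True: others' communities are forwarded externally; False: cleaner


def observed_at_collector(own, received, behavior):
    """Return the communities visible at the collector connected to AS X."""
    visible = set()
    if behavior.tagger:
        visible |= set(own)        # Figure 1: own communities leave the AS
    if behavior.forward:
        visible |= set(received)   # Figure 3: others' communities leave the AS
    return visible


own = {"X:X1", "X:X2"}             # communities defined by AS X
received = {"Y:Y1", "Z:Z1"}        # communities set by taggers Y and Z

for name, behavior in [
    ("tagger, forward", Behavior(True, True)),
    ("tagger, cleaner", Behavior(True, False)),
    ("silent, forward", Behavior(False, True)),
    ("silent, cleaner", Behavior(False, False)),
]:
    print(name, sorted(observed_at_collector(own, received, behavior)))
```

In this simplification, a silent cleaner AS leaves no communities at the collector at all, which is why passive inference needs observations from many paths.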

Community Usage Data Sets

We provide data sets for AS-level community usage inferred by using the inference method in our paper.

Each of the community usage data sets below has been computed using our algorithm. As input to the algorithm, we utilize all available update files and RIBs for the respective day, from three major route collector projects: RIPE, RouteViews, and Isolario.

The script tries to infer the community usage behavior of every AS in the input data set. The output format is as follows:

ASN|class
ASN corresponds to the AS number of the AS, and class is a label that indicates the inferred class of that AS.
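
As a minimal sketch, a data set in this format could be loaded and summarized as shown below. It assumes that every non-empty line is exactly ASN|class with a numeric ASN and no header or comment lines; the helper name parse_usage_lines is hypothetical, and the sample lines use documentation ASNs with made-up labels, not real inferences.

```python
from collections import Counter


def parse_usage_lines(lines):
    """Yield (asn, label) tuples from 'ASN|class' lines, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        asn, label = line.split("|", 1)
        yield int(asn), label


# Made-up sample; for a real file, pass an open file object instead, e.g.
# open("as2commusage.20210915.txt").
sample = ["64496|tf", "64497|sc", "64498|tf"]

as2label = dict(parse_usage_lines(sample))
label_counts = Counter(as2label.values())
print(label_counts.most_common())
```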

Community usage data sets
Name	Size	Last modified (UTC)
as2commusage.20170315.txt	510K	2021-10-23 01:40
as2commusage.20170615.txt	520K	2021-10-23 01:40
as2commusage.20170915.txt	529K	2021-10-23 01:40
as2commusage.20171215.txt	538K	2021-10-23 01:40
as2commusage.20180315.txt	547K	2021-10-23 01:40
as2commusage.20180615.txt	556K	2021-10-23 01:40
as2commusage.20180915.txt	565K	2021-10-23 01:40
as2commusage.20181215.txt	574K	2021-10-23 01:40
as2commusage.20190315.txt	584K	2021-10-23 01:40
as2commusage.20190615.txt	593K	2021-10-23 01:40
as2commusage.20190915.txt	605K	2021-10-23 01:40
as2commusage.20191215.txt	613K	2021-10-23 01:40
as2commusage.20200315.txt	624K	2021-10-23 01:40
as2commusage.20200615.txt	634K	2021-10-23 01:40
as2commusage.20200915.txt	644K	2021-10-23 01:40
as2commusage.20201215.txt	655K	2021-10-23 01:40
as2commusage.20210315.txt	661K	2021-10-23 01:40
as2commusage.20210615.txt	710K	2021-10-23 01:40
as2commusage.20210915.txt	718K	2021-10-23 01:40

Reproduce: Script & Input Data

You can reproduce our inferences documented in the paper using our communityusage.py script, as well as the pre-processed input data (688MB).

Disclaimer: The communityusage.py script does not sanitize or preprocess input data in any way. Artifacts like unallocated ASNs or prefixes need to be handled separately. To enable the reproduction of our results, we provide a pre-processed input data set that is sanitized according to the processing steps and methodology we detail in our paper.

The following commands download and process the input data:


# download the python code and the pre-processed input data:
wget https://www.cmand.org/communityusage/res/communityusage.py
wget https://www.cmand.org/communityusage/res/inputdata.bz2

# pipe input data into python script:
bzcat inputdata.bz2 | python3 communityusage.py > output.txt

The script communityusage.py reads input data via stdin with the following format:

AS PATH sequence|community set[|occurrences]
The provided input data set contains AS paths and community sets from May 19, 2021 (as in the paper), aggregating all update files and RIBs from the three major route collector projects: RIPE, RouteViews, and Isolario.


# print top 3 lines of the input data set
bzcat inputdata.bz2 | head -n3
44393 35710 4809 12389 29049 49666 12880 41689 43754 49100 1756||12
203478 206499 50629 174 46887 40339|206499|2
14630 6939 20473 213064|6939 14630|54

In the last line, 14630 6939 20473 213064 is the AS PATH sequence, with 14630 being the collector peer and 213064 the origin AS; 6939 14630 is the community set, specifically the unique set of upper 2 bytes of regular communities and upper 4 bytes of large communities; and 54 is the number of occurrences (optional, defaults to 1). The community set may be empty, as in the first line.
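
For readers who want to inspect or generate input in this format, the following sketch parses a single line. It is not taken from communityusage.py; the function name parse_input_line is hypothetical, and it assumes that AS paths and community sets consist of space-separated integers only.

```python
def parse_input_line(line):
    """Split 'AS PATH sequence|community set[|occurrences]' into its parts.

    Returns (path, communities, occurrences), where path is a list of ASNs
    from the collector peer to the origin AS, communities is a set of the
    upper 2/4 bytes of the observed communities, and occurrences is an int.
    """
    fields = line.rstrip("\n").split("|")
    if len(fields) not in (2, 3):
        raise ValueError(f"unexpected number of fields: {line!r}")
    path = [int(asn) for asn in fields[0].split()]
    communities = {int(value) for value in fields[1].split()}
    # The occurrence counter is optional and defaults to 1.
    occurrences = int(fields[2]) if len(fields) == 3 and fields[2] else 1
    return path, communities, occurrences


path, communities, occurrences = parse_input_line(
    "14630 6939 20473 213064|6939 14630|54"
)
print("collector peer:", path[0], "origin AS:", path[-1])
print("community set:", sorted(communities), "occurrences:", occurrences)
```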

If you want to use a different input data set, you will have to pre-process it accordingly, or adapt the script.

The output format of the resulting file output.txt is as follows:

ASN|class
ASN corresponds to the AS number of the AS, and class is a label that indicates the inferred class of that AS.

Classification

To indicate the class of an AS, in our data sets we use a label consisting of two characters: We assign the first character according to the tagging behavior:

t if the tagging behavior is tagger or
s if the tagging behavior is silent.

We assign the second character according to the forwarding behavior:

f if the forwarding behavior is forward or
c if the forwarding behavior is cleaner.

In both cases, we assign u if the behavior is undecided, or n if there are no counters to make any inference.

Since each AS has a tagging and forwarding behavior, we assign a 2-character label as follows:

tf = tagger and forward
tc = tagger and cleaner
sf = silent and forward
sc = silent and cleaner

or any other combination [tsun][fcun]. For example, sn indicates that the tagging behavior of an AS is silent while the forwarding shows no counters, and uf indicates that the tagging behavior is undecided, while the forwarding behavior is forward.
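
The label scheme above can be decoded mechanically. The following sketch maps a 2-character label to its two behaviors; the names TAGGING, FORWARDING and decode_label are hypothetical and not part of the published script.

```python
# First character: tagging behavior; second character: forwarding behavior.
TAGGING = {"t": "tagger", "s": "silent", "u": "undecided", "n": "no counters"}
FORWARDING = {"f": "forward", "c": "cleaner", "u": "undecided", "n": "no counters"}


def decode_label(label):
    """Translate a label matching [tsun][fcun] into (tagging, forwarding)."""
    if len(label) != 2 or label[0] not in TAGGING or label[1] not in FORWARDING:
        raise ValueError(f"not a valid class label: {label!r}")
    return TAGGING[label[0]], FORWARDING[label[1]]


for label in ("tf", "sn", "uf"):
    tagging, forwarding = decode_label(label)
    print(f"{label}: tagging={tagging}, forwarding={forwarding}")
```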

Referencing our measurement study

If you are using data from our work in your publication, please cite it with the following reference:

@inproceedings{krenc2021AScomm,
   title = {AS-Level BGP Community Usage Classification},
   author = {Krenc, Thomas and Beverly, Robert and Smaragdakis, Georgios},
   booktitle = {Proceedings of the 2021 ACM Internet Measurement Conference},
   year = {2021},
   month = nov,
   doi = {10.1145/3487552.3487865},
   location = {Virtual Event}
}

Background

BGP Communities: The BGP Communities Attribute is part of the update message and is used to label multiple announced prefixes with common properties. It is a variable-length attribute and can store multiple communities. The community attribute is also transitive, which allows communities to be propagated across multiple ASes.
A regular BGP community is a 32-bit integer denoted in the form α:β, where α typically represents the AS that defines the value β. Thus, every AS can define its own values without collisions. This convention only works for 16-bit ASNs. To accommodate 32-bit ASNs as well, the BGP Large Communities Attribute has been introduced. A large community is a 96-bit (3x32-bit) integer denoted in the form α:β:γ, where α represents a 32-bit ASN, and β and γ provide 64 additional bits for the community value.
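
To make the bit layout concrete, the sketch below splits regular and large communities into their parts using plain integer arithmetic. The function names are hypothetical, and the example values use a documentation ASN; the upper part (α) is what the community sets in our input data refer to as the upper 2/4 bytes.

```python
def join_regular(alpha, beta):
    """Pack α:β (16 bits each) into one 32-bit regular community value."""
    if not (0 <= alpha <= 0xFFFF and 0 <= beta <= 0xFFFF):
        raise ValueError("alpha and beta must each fit into 16 bits")
    return (alpha << 16) | beta


def split_regular(value):
    """Split a 32-bit regular community into (α, β)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a regular community must fit into 32 bits")
    return value >> 16, value & 0xFFFF


def split_large(value):
    """Split a 96-bit large community into (α, β, γ), 32 bits each."""
    if not 0 <= value < 1 << 96:
        raise ValueError("a large community must fit into 96 bits")
    return value >> 64, (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


regular = join_regular(64496, 100)                 # community 64496:100
print(split_regular(regular))                      # upper 2 bytes are the ASN

large = (4200000000 << 64) | (1 << 32) | 2         # community 4200000000:1:2
print(split_large(large))                          # upper 4 bytes are the ASN
```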

Change log

October 21 2021: added various resources.
June 7 2021: init.

Last modified: March 10 2022 20:28 (UTC)