AS-Level BGP Community Usage Classification |
Welcome |
This website provides access to our current research on BGP communities. If
you are new to this website, we recommend taking a quick look at the introduction. For the full
technical details, please read our IMC paper, "AS-Level BGP Community Usage Classification".
Please cite our work if you use data sets or scripts from this website. For further questions and feedback, please contact us at communityexploration AT cmand DOT org. |
Introduction |
BGP communities are a popular mechanism used by network operators for traffic engineering, blackholing, and realizing network policies and business strategies. We present a passive algorithm to infer community usage at the per-AS level. At its core, our algorithm implements a system of constraints that groups ASes according to their community tagging and forwarding behavior. |
|
AS-Level Community Usage |
Community Usage Data Sets |
We provide data sets for AS-level community usage, inferred with the method described in our paper. |
Each of the community usage data sets below has been computed using our
algorithm. As input to the algorithm, we use all available update files and
RIBs for the respective day from three major route collector projects: RIPE,
RouteViews, and Isolario. The script tries to infer the community usage behavior of every AS in the input data set. In the output, ASN corresponds to the AS number of the AS, and class is a label that indicates the inferred class of that AS.
|
Reproduce: Script & Input Data |
You can reproduce our inferences documented in the paper using our communityusage.py script, as well as the pre-processed input data (688MB). |
Disclaimer: The communityusage.py script does not sanitize or
preprocess input data in any way. Artifacts like unallocated ASNs or prefixes
need to be handled separately. To enable the reproduction of our results, we
provide a pre-processed input data set that is
sanitized according to the processing steps and methodology we detail in our paper. The following commands download and process the input data:
# download the python code and the pre-processed input data:
wget https://www.cmand.org/communityusage/res/communityusage.py
wget https://www.cmand.org/communityusage/res/inputdata.bz2
# pipe input data into python script:
bzcat inputdata.bz2 | python3 communityusage.py > output.txt
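The script communityusage.py reads input data via stdin, one record per line, with up to three fields separated by |: the AS path, the community set, and an optional number of occurrences. The first lines of the provided input data set illustrate the format: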
# print top 3 lines of the input data set
bzcat inputdata.bz2 | head -n3
44393 35710 4809 12389 29049 49666 12880 41689 43754 49100 1756||12
203478 206499 50629 174 46887 40339|206499|2
14630 6939 20473 213064|6939 14630|54
In the third line, 14630 6939 20473 213064 is the AS path, with 14630 being the collector peer and 213064 the origin AS; 6939 14630 is the community set, specifically the set of unique upper 2 bytes (regular communities) or upper 4 bytes (large communities); and 54 is the number of occurrences (optional, defaults to 1). If you want to use a different input data set, you will have to pre-process it accordingly or adapt the script. In the resulting file output.txt, ASN corresponds to the AS number of the AS, and class is a label that indicates the inferred class of that AS. |
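To make the input format concrete, here is a minimal Python sketch that parses one input line into its three fields. It is an illustration only and is not taken from communityusage.py; the names Record and parse_line are hypothetical, and the handling of a missing count simply follows the "defaults to 1" rule stated above.

```python
from typing import FrozenSet, List, NamedTuple


class Record(NamedTuple):
    """One input record (hypothetical helper type, not part of the script)."""
    as_path: List[int]           # collector peer first, origin AS last
    communities: FrozenSet[int]  # unique upper 2/4 bytes of the communities
    count: int                   # number of occurrences


def parse_line(line: str) -> Record:
    """Parse one 'AS path|community set|count' line."""
    fields = line.strip().split("|")
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 2 or 3 '|'-separated fields: {line!r}")
    as_path = [int(asn) for asn in fields[0].split()]
    # An empty community field yields an empty set.
    communities = frozenset(int(c) for c in fields[1].split())
    # The count is optional and defaults to 1.
    count = int(fields[2]) if len(fields) == 3 and fields[2].strip() else 1
    return Record(as_path, communities, count)


record = parse_line("14630 6939 20473 213064|6939 14630|54")
peer, origin = record.as_path[0], record.as_path[-1]
```

For the third sample line, peer is the collector peer 14630 and origin is the origin AS 213064. A line with an empty community field, such as the first sample line, parses to an empty community set.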
Classification |
To indicate the class of an AS, in our data sets we use a label consisting of two characters:
We assign the first character according to the tagging behavior: t if the tagging behavior is tagger, or s if it is silent. We assign the second character according to the forwarding behavior: f if the forwarding behavior is forward, or c if it is cleaner. In both cases, we assign u if the behavior is undecided, or n if there are no counters to make any inference. Since each AS has a tagging and a forwarding behavior, we assign a 2-character label as follows:
tf = tagger and forward
tc = tagger and cleaner
sf = silent and forward
sc = silent and cleaner
or any other combination [tsun][fcun]. For example, sn indicates that the tagging behavior of an AS is silent while the forwarding behavior shows no counters, and uf indicates that the tagging behavior is undecided while the forwarding behavior is forward. |
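The label scheme can be decoded mechanically. The following Python sketch maps a 2-character label to its tagging and forwarding behavior; the function name decode_label and the two lookup tables are hypothetical helpers for illustration and are not part of our script.

```python
from typing import Tuple

# First character: tagging behavior.
TAGGING = {"t": "tagger", "s": "silent", "u": "undecided", "n": "no counters"}
# Second character: forwarding behavior.
FORWARDING = {"f": "forward", "c": "cleaner", "u": "undecided", "n": "no counters"}


def decode_label(label: str) -> Tuple[str, str]:
    """Return (tagging behavior, forwarding behavior) for a label [tsun][fcun]."""
    if len(label) != 2 or label[0] not in TAGGING or label[1] not in FORWARDING:
        raise ValueError(f"not a valid class label: {label!r}")
    return TAGGING[label[0]], FORWARDING[label[1]]


tagging, forwarding = decode_label("sn")
```

Here decode_label("sn") yields silent tagging with no counters for forwarding, matching the example above.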
Referencing our measurement study |
If you are using data from our work in your publication, please cite it with the following reference:
@inproceedings{krenc2021AScomm,
title = {AS-Level BGP Community Usage Classification},
author = {Krenc, Thomas and Beverly, Robert and Smaragdakis, Georgios},
booktitle = {Proceedings of the 2021 ACM Internet Measurement Conference},
year = {2021},
month = nov,
doi = {10.1145/3487552.3487865},
location = {Virtual Event}
}
|
Background |
BGP Communities
The BGP
Communities Attribute is part of the update message and is used to label
multiple announced prefixes with common properties. It is a variable-length
attribute and can store multiple communities. The community attribute is also
a transitive attribute, which allows communities to be propagated across
multiple ASes. A regular BGP community is simply a 32-bit integer denoted in the form α:β, where α typically represents the AS that defines the value β. Thus, every AS can define its own values without collisions. This convention applies only to 16-bit ASNs. To accommodate 32-bit ASNs as well, the BGP Large Communities Attribute has been introduced. A large community consists of three 32-bit integers denoted in the form α:β:γ, where α represents a 32-bit ASN and β and γ provide an additional 64 bits for the community value. |
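The two encodings can be illustrated with a short Python sketch that splits a regular community into α:β and a large community into α:β:γ. The function names are hypothetical, and treating a large community as a single 96-bit integer is an assumption made here purely for illustration.

```python
from typing import Tuple


def encode_regular(alpha: int, beta: int) -> int:
    """Pack alpha:beta (16 bits each) into one 32-bit regular community."""
    if not (0 <= alpha <= 0xFFFF and 0 <= beta <= 0xFFFF):
        raise ValueError("alpha and beta must each fit in 16 bits")
    return (alpha << 16) | beta


def decode_regular(value: int) -> Tuple[int, int]:
    """Split a 32-bit regular community into (alpha, beta)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a regular community is a 32-bit integer")
    return value >> 16, value & 0xFFFF


def decode_large(value: int) -> Tuple[int, int, int]:
    """Split a 96-bit large community into (alpha, beta, gamma)."""
    if not 0 <= value < (1 << 96):
        raise ValueError("a large community is 3 x 32 bits")
    return value >> 64, (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


alpha, beta = decode_regular(encode_regular(6939, 100))
```

The upper 2 bytes of a regular community and the upper 4 bytes of a large community are the α part, which is the value that appears in the community sets of the input data described above.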
Change log |
Last modified: March 10 2022 20:28 (UTC) |