BESS [9]: BESS Scheduler

Modules and Tasks

Let's take a look at a simple pipeline from a sample script (bess/bessctl/conf/samples/acl.bess):

localhost:10514 $ show pipeline 
+---------+                    +----------+                    +-----+                    +-------+
| source0 |  :0 200285248 0:   | rewrite0 |  :0 200257248 0:   | fw  |  :0 100117232 0:   | sink0 |
| Source  | -----------------> | Rewrite  | -----------------> | ACL | -----------------> | Sink  |
+---------+                    +----------+                    +-----+                    +-------+
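
A pipeline like this is built in a BESS script by instantiating modules and connecting them with the -> operator. The following is only a minimal sketch in that style; the Rewrite template and the ACL rule are illustrative placeholders, not the actual contents of acl.bess:

import scapy.all as scapy

# An illustrative template packet for Rewrite (not the one used in acl.bess)
pkt = scapy.Ether()/scapy.IP(src='10.0.0.1', dst='10.0.0.2')/scapy.UDP(sport=1234, dport=5678)

source0::Source()
rewrite0::Rewrite(templates=[bytes(pkt)])
fw::ACL(rules=[{'src_ip': '10.0.0.0/8', 'drop': False}])  # placeholder rule: allow 10/8

source0 -> rewrite0 -> fw -> Sink()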

The module classes above can be divided into two categories, based on their behavior:

  • The Rewrite, ACL, and Sink classes are only called by their left neighbor when there are packets to process.
  • The Source module class is called periodically and generates packets on its own.

The Source module class behaves differently because its instances create a task.

The QueueInc class behaves a lot like Source: it registers a task that is called periodically to read packets from one rxq of a Port. Remember: BESS ports operate in polling mode to avoid interrupt overhead.

The PortInc module is very similar to the QueueInc module, except that it may register multiple tasks (one for each rxq of the Port).
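
As a rough illustration of the difference (the port configuration below is an assumption, not taken from a sample script):

# A DPDK port with two RX queues (illustrative)
port0::PMDPort(port_id=0, num_inc_q=2, num_out_q=1)

# PortInc registers one task per rxq of the port (two tasks in this case)
PortInc(port=port0.name) -> Sink()

# QueueInc would instead register a single task that polls one specific rxq:
# QueueInc(port=port0.name, qid=0) -> Sink()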

A module class is not forced to choose between receiving packets and generating them: there are classes that mix the two behaviors. Let's take a look at another pipeline (bess/bessctl/conf/samples/queue.bess):

localhost:10514 $ show pipeline 
+--------+                   +----------+                   +-----------+                  +-------------------+                  +-------+
|  src   |                   | rewrite0 |                   |   queue   |                  |    vlan_push0     |                  | sink0 |
| Source |  :0 26043968 0:   | Rewrite  |  :0 26040608 0:   |   Queue   |  :0 2897536 0:   |     VLANPush      |  :0 2898400 0:   | Sink  |
|        | ----------------> |          | ----------------> | 1023/1024 | ---------------> | PCP=0 DEI=0 VID=2 | ---------------> |       |
+--------+                   +----------+                   +-----------+                  +-------------------+                  +-------+

The queue module above (an instance of Queue, very different from QueueInc!) receives packets generated by the task registered by src, but also registers its own task. It doesn't immediately forward the packets it receives; instead, it stores them in a ring buffer. The task created by queue later reads packets from the ring buffer and forwards them to the module's right neighbor.
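
In script form, such a pipeline might look like the sketch below; the Rewrite template and the argument values are assumptions, not the actual contents of queue.bess:

tmpl = bytes(60)                # 60-byte all-zero placeholder template

src::Source()
queue::Queue(size=1024)         # ring buffer, drained by queue's own task
vlan_push0::VLANPush(tci=2)     # pushes a VLAN tag with VID=2

# src's task generates packets and pushes them as far as the ring buffer...
src -> Rewrite(templates=[tmpl]) -> queue
# ...while queue's own task later pulls them out and forwards them downstream
queue -> vlan_push0 -> Sink()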

There are two different tasks in the pipeline. How often does the src task get called? How often does the queue task get called?

The BESS scheduler

BESS implements a fully hierarchical task scheduler that supports different policies. The job of the scheduler is to decide which task needs to be executed next. The tasks in the scheduler are organized in a tree-like data structure, where the leaf nodes are the tasks themselves and the other nodes represent particular policies.

When neither the user nor the BESS script configures the scheduler, the execution of all tasks is interleaved in a round robin fashion. The scheduler tree can be examined using the show tc command:

localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- !leaf_src:0          leaf
      +-- !leaf_queue:0        leaf

The above command shows that we have only one thread (worker 0) with a very simple tree: there's a root node (called !default_rr_0) of type round_robin with two children, !leaf_src:0 and !leaf_queue:0, which are the two tasks registered by the src and queue modules. In this case the scheduler's behavior is very simple: it alternates the execution of src and queue over and over.

How to configure the scheduler

Add rate limiting

The rate of execution of a task can be throttled with a 'rate_limit' node in the scheduler tree. We can create such a node with a line in the BESS configuration script:

bess.add_tc('fast', policy='rate_limit', resource='packet', limit={'packet': 9000000})

If we inspect the tree now we see:

localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- !leaf_src:0          leaf
      +-- !leaf_queue:0        leaf
      +-- fast                 rate_limit          9.000 Mpps

The newly created node doesn't have any effect on the src or queue tasks, because they're still under the round_robin policy. To actually enforce the limit on a task, we have to make it a child of the rate_limit node, using this code:

src.attach_task('fast')

With the above line we tell the module src to attach its task under the 'fast' node. Now the scheduler tree looks like:

localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- !leaf_queue:0        leaf
      +-- fast                 rate_limit          9.000 Mpps
          +-- !leaf_src:0      leaf

Similarly, we can also limit the execution of the task registered by the queue module to a slower rate with these two lines:

bess.add_tc('slow', policy='rate_limit', resource='packet', limit={'packet': 1000000})
queue.attach_task('slow')

The final tree looks like this:

localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- fast                 rate_limit          9.000 Mpps
      |   +-- !leaf_src:0      leaf
      +-- slow                 rate_limit          1.000 Mpps
          +-- !leaf_queue:0    leaf
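
Putting it together, the scheduler-related part of the configuration boils down to these four lines (on top of the pipeline definition from queue.bess):

bess.add_tc('fast', policy='rate_limit', resource='packet', limit={'packet': 9000000})
src.attach_task('fast')

bess.add_tc('slow', policy='rate_limit', resource='packet', limit={'packet': 1000000})
queue.attach_task('slow')

With this configuration the src task generates at most 9 million packets per second, while the queue task forwards at most 1 million packets per second.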

BESS Python API reference

The following functions are used to interact with the scheduler from a BESS script (a combined usage sketch follows the reference):

  • bess.add_tc(name, policy, wid=-1, parent='', resource=None, priority=None, share=None, limit=None, max_burst=None)

    Create a new node in the scheduler tree called name of type policy. name must be a unique string that identifies the node (it cannot start with '!'). policy can be one of the following:

    • 'round_robin':
      Each time this node is visited by the scheduler, a child is picked in a round robin fashion.
    • 'weighted_fair':
      The children of this node are executed in proportion to their share.
    • 'rate_limit':
      The node can have at most one child. When the child's execution exceeds the limits imposed by limit and max_burst, the node is put in a blocked state (it is unblocked after an appropriate amount of time).
    • 'priority':
      The node always schedules the child with the highest priority that's not blocked (in the sense described for the rate_limit policy). If a node has no children, or if all its children are temporarily blocked, it is considered blocked itself.
    • 'leaf':
      Nodes with this policy represent a task. They cannot be created with add_tc; they're added automatically when a module registers a task.

    The wid and parent arguments control where to place the new node in the tree. They're mutually exclusive, i.e. only one of them can have a non-default value. If parent is specified, its value must be the name of an existing node in the tree: the newly added node will become one of its children. If wid is specified, the newly added node will become the root of the tree on worker wid; if there's a root already, both will be placed under a round robin node named '!default_rr_<wid>'. If wid is also unspecified (i.e. -1), the worker will be chosen in a round robin fashion.

    The weighted_fair and rate_limit policies can, respectively, share among their children or limit different types of resources. The resource parameter must be used when creating one of these nodes, to choose which resource. It can be one of:

    • count: The new node will share fairly or limit the number of times a child is scheduled.
    • cycle: The new node will share fairly or limit the number of cycles (as measured by the TSC) a child's execution takes.
    • packet: The new node will schedule its children so as to share fairly or limit the number of packets generated.
    • bit: The new node will schedule its children so as to share fairly or limit the number of bits generated.

    The next two parameters are only used when attaching to certain types of parent nodes. The priority parameter controls the priority that this node has among the children of a priority parent (a lower number means higher priority). The share parameter controls the relative share of the resource among the children of a weighted_fair parent.

    limit and max_burst are only used when creating rate_limit nodes: they control the rate of the resource used and the excess allowed. They must be in the form of an object with the resource as key (which must be the same as the resource parameter) and an integer as value (e.g. {'packet': 1000}). The node will schedule its children to consume no more than limit units per second.

  • <module>.attach_task(parent='', wid=-1, module_taskid=0, priority=None, share=None)

    bess.attach_task(module_name, parent='', wid=-1, module_taskid=0, priority=None, share=None)

    Moves a task in the scheduler tree. The two forms are equivalent.

    The task to move is the one numbered module_taskid (usually 0) registered by a module. The module is identified by the <module> object in the first form, or by module_name in the second form.

    parent and wid behave like in bess.add_tc().

    priority and share behave like in bess.add_tc().

  • bess.update_tc_params(name, resource=None, limit=None, max_burst=None)

    Update the parameters of the node named name in the scheduler tree. Only weighted_fair and rate_limit nodes have parameters that can be updated.

    resource, limit and max_burst behave like in bess.add_tc().
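
The walkthrough above only exercised rate_limit. The sketch below shows, with the same API, how the other configurable policies and update_tc_params could be used; the module names (src_a, src_b, burst_src, bulk_src) and the numeric values are hypothetical:

# weighted_fair: children share packet throughput in proportion to 'share'
bess.add_tc('wfq', policy='weighted_fair', resource='packet')
src_a.attach_task('wfq', share=3)
src_b.attach_task('wfq', share=1)

# priority: the parent always runs its highest-priority unblocked child
# (as described above, a lower number means a higher priority)
bess.add_tc('prio', policy='priority')
burst_src.attach_task('prio', priority=0)
bulk_src.attach_task('prio', priority=1)

# update_tc_params: e.g. raise the 'slow' limit from the walkthrough to 2 Mpps
bess.update_tc_params('slow', resource='packet', limit={'packet': 2000000})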
