Stride Scheduling
=================

What's wrong with the BSD scheduler?
- Hard to provide isolation / prevent interference
- Priorities are absolute
- Can't transfer priority (e.g., to a server on RPC)
- No flexible control
  - E.g., in Monte Carlo simulations, the error is 1/sqrt(N) after N trials
    - Want to get a quick estimate from a new computation
    - Leave a bunch running for a while to get more accurate results
  - Multimedia applications
    - Often fall back to degraded quality levels depending on resources
    - Want to control the quality of different streams

What's wrong with lottery scheduling?
- Randomization: expected relative and absolute error are O(sqrt(n_a)) after
  n_a allocations

What are relative & absolute error?  How do you calculate them?
- E.g., process A should have had 1/3 of the CPU, yet after 1 minute has had
  only 19 seconds (its fair share was 20 seconds, so it is 1 second behind)
- Absolute error = sum of all such errors
- Relative error = the same difference computed between a pair of processes

Why does lottery scheduling have O(sqrt(n_a)) error?
- The probability of getting k out of n quanta is the binomial distribution:
    (n choose k) * p^k * (1-p)^(n-k)      [n choose k = n!/(k!(n-k)!)]
- For large n, the binomial distribution is approximately normal
- Variance for a single allocation (a Bernoulli trial with success
  probability p):
    p(1-p)^2 + (1-p)p^2 = p(1-p)(1-p+p) = p(1-p)
- Variance for n allocations = np(1-p), so the standard deviation is
  O(sqrt(n))

What is stride scheduling?
- Idea: every process has:
    tickets = priority assigned by the administrator
    stride  = stride1 / tickets      (stride1 is a large fixed constant)
    pass    = roughly how much CPU time the process has used
- Schedule the process c with the lowest pass, then set c->pass += c->stride
- Example: look at Figure 2
- Benefits:
  - Good control over resource allocation
  - Can transfer tickets on an IPC
  - Currency, inflation - a nice model for divvying up resources among
    users, letting users then sub-allocate whatever resources they have

What happens if you give tickets to a process or remove them?
- Can you just recompute the stride and be done with it?
- No.  E.g., if a low-priority process gets more tickets, it still sees a
  long latency before its old (large) pass value comes up again
  - Bad for IPC, since a ticket transfer won't help you much
- Idea: the remaining portion of your pass should accrue at the new ticket
  rate:
    old_stride = c->stride;
    c->tickets = new_tickets;
    c->stride = stride1 / c->tickets;
    remain = c->pass - global_pass;
    remain *= c->stride / old_stride;    /* i.e., new_stride / old_stride */
    c->pass = global_pass + remain;

What if you go to sleep and don't use your full quantum?
- If you use fraction f of the time quantum, advance pass by f * stride
- To go to sleep, why not just set your # of tickets to zero?
  - Sort of what you want to do, except it would require division by zero
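To make this concrete, here is a minimal C sketch of the mechanics above:
picking the client with the minimum pass, charging for partial quanta, and
rescaling pass on a ticket change.  The struct layout, names, QUANTUM
scaling, and STRIDE1 value are illustrative assumptions rather than the
paper's exact code, and maintenance of global_pass is elided:

    #include <stddef.h>
    #include <stdint.h>

    #define STRIDE1 (1u << 20)   /* large constant; stride = STRIDE1/tickets */
    #define QUANTUM 1024          /* 'used' == QUANTUM means a full quantum */

    struct client {
        uint32_t tickets;         /* set by the administrator */
        uint32_t stride;          /* STRIDE1 / tickets */
        uint64_t pass;            /* virtual time of next service */
    };

    static uint64_t global_pass;  /* advances at the aggregate ticket rate;
                                     its maintenance is omitted here */

    /* Pick the runnable client with the minimum pass value. */
    struct client *pick_next(struct client *c, size_t n)
    {
        struct client *best = &c[0];
        for (size_t i = 1; i < n; i++)
            if (c[i].pass < best->pass)
                best = &c[i];
        return best;
    }

    /* Charge a client after it runs; a client that slept early is charged
       only for the fraction of the quantum it actually used. */
    void charge(struct client *c, uint32_t used)
    {
        c->pass += (uint64_t)c->stride * used / QUANTUM;
    }

    /* Change a client's ticket allocation: scale the remaining portion of
       its pass so it accrues at the new rate from now on.  new_tickets must
       be nonzero (cf. the division-by-zero note above). */
    void set_tickets(struct client *c, uint32_t new_tickets)
    {
        uint32_t old_stride = c->stride;
        c->tickets = new_tickets;
        c->stride = STRIDE1 / new_tickets;
        uint64_t remain = c->pass - global_pass;
        c->pass = global_pass + remain * c->stride / old_stride;
    }

A real implementation would keep clients in a priority queue ordered by
pass, making each decision O(log n) rather than the O(n) scan shown here.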
Does it work?  What are we seeing in Figure 9?
- Lottery scheduling seems to drift from its allocations
- The curves look O(sqrt(n_a)), so this is as expected

Why do we care?  Is this useful?
- What questions should we be asking when we evaluate stride scheduling?
- How will it impact applications we actually care about?  In particular:
  1. Overall system throughput
  2. Response time for applications that need it
  3. Fairness / adherence to the prescribed policy

How does the system stack up?
1. Maybe 0.2% slower than Linux... but both schedulers are suboptimal
   (So probably we don't care here, if no one bothered to optimize)
2. What evidence do we have that response time is good?
   - Figures 11 and 12 -- looks better than lottery scheduling
   - How would the BSD scheduler have done?
     - Better, but of course the allocations would have been 1:1
   - Are we interested in these workloads?
     - If I'm compiling tcpproxy, will my emacs respond quickly?
     - Hard to say based purely on the data in the paper
3. How about adherence to policy?
   - Looks pretty good, but only on the workloads they show

What happens if 100 processes have ticket allocations 100:1:1:...:1?
- High-priority process P0 runs for 100 quanta, then each low-priority
  process gets one quantum, then repeat
- Might alternatively have scheduled P0, P1, P0, P2, P0, P3, P0, P4, ...
- Is the basic 100q/1q/1q/... schedule good or bad?  (take a vote)
  - Good: for a CPU-bound workload, fewer context switches => fewer cache
    misses, TLB misses, page faults, etc.
  - Bad: if you care about response time, P0 has to wait 100 quanta

Do we care about response time?  Depends.  When do we care?
- Interactive workloads--100 quanta is too long to wait for keys to echo
- Network protocols--want to overlap latencies
  - A web server might not need a lot of CPU time, but we want to issue
    disk requests as soon as we get network requests
- Multimedia applications--e.g., keep a constant video frame rate

So how do we give high-priority processes low response time?
- Hierarchical stride scheduling (see the sketch at the end of these notes)
  [picture of tree, see how quanta get allocated]
- Now the absolute error is O(log n_c), not O(n_c)

What are we seeing in Figure 13?
- 8 processes, ticket ratios 7:1:1:1:1:1:1:1
- How often do low-priority processes get scheduled?  (every 14 quanta)
- The graph shows the waiting time for the high-priority process
- Hierarchical is better, as we would expect
- How would lottery scheduling look on this experiment?
  - Low-priority processes would have more variance
  - The high-priority process would probably do much better than under
    non-hierarchical stride

Can this be extended to other resources?
- Yes, they discuss an application to networking
- What about VM?  Is it as useful?  (Every page is equivalent; not every
  quantum is)

Consider the following experiment:
- Process A reads random blocks from file F_A (which doesn't fit in cache)
- Process B reads random blocks from file F_B (which doesn't fit in cache)
- Running A or B alone takes 10 seconds; running A then B takes 20 seconds.
  How long does it take to run A & B concurrently?
  - Probably much longer than 20 seconds
  - The file system groups a file's blocks together; two files => longer
    seeks
- What if A has 10 tickets and B has 1 ticket--what are the relative
  allocations?
  - Probably 1:1, NOT 10:1
  - The processes are I/O-bound, not CPU-bound:
      read, sleep, switch, read, sleep, switch, ...
- What is the desired behavior?  Any way to make this work better?

Discussion: Is stride scheduling a good idea?
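The sketch promised above: one way to realize hierarchical stride
scheduling as a binary tree in C.  Each internal node holds the ticket sum
of its children and runs an ordinary two-way stride scheduler over them, so
an allocation is a root-to-leaf walk.  The node layout and STRIDE1 value
are assumptions for illustration, not the paper's code:

    #include <stddef.h>
    #include <stdint.h>

    #define STRIDE1 (1u << 20)

    /* A full binary tree: internal nodes aggregate their children's
       tickets; leaves are the actual clients. */
    struct node {
        struct node *left, *right;   /* both NULL iff this is a leaf */
        uint32_t tickets;            /* leaf: own; internal: left + right */
        uint32_t stride;             /* STRIDE1 / tickets */
        uint64_t pass;
    };

    /* Allocate one quantum: walk from the root, at each level descending
       into the child with the smaller pass, charging every node on the
       path its own stride.  Touches O(log n_c) nodes per quantum. */
    struct node *schedule(struct node *n)
    {
        while (n->left) {            /* internal node */
            n->pass += n->stride;
            n = (n->left->pass <= n->right->pass) ? n->left : n->right;
        }
        n->pass += n->stride;        /* charge the chosen leaf */
        return n;                    /* the client to run */
    }

Ticket changes likewise propagate from a leaf up to the root, updating
tickets, stride, and pass at each ancestor, so dynamic operations also cost
O(log n_c).  Intuitively, each level is a two-client stride scheduler with
constant absolute error, and a client's error is the sum over the
O(log n_c) levels on its path--hence the improved error bound.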