UBM
===

What kind of paper is this?
- Investigates a topic that has been researched for decades.
- Assumes lots of background and is therefore harder to understand.
- Reader should expect a detailed evaluation.

Explain figure 1
- What are we seeing in Figure 1a?
  - Is this an interesting reference pattern?
  - What is IRG? The inter-reference gap: how many references pass before a
    block is referenced again.
- In 1b, does the graph just expand to the right with more blocks? Almost. Why?
  A larger IRG means the block is needed farther in the future.
  But depending on access patterns, larger-IRG blocks may still be cacheable.
  E.g., first reference only low-IRG blocks, then only high-IRG ones.
  See Fig 1b, 200 cache blocks vs. 400.
- In 1c, LRU extends less far to the right. Why? What is the problem?
  The blocks don't fit:
  - How many blocks are referenced? ~5,500.
  - What is the maximum number of blocks in the cache? 1,000
    ==> memory holds only ~20% of the working set.
  - Is the LRU distribution surprising for 1,000 blocks? No: out at x = 25 the
    IRG is ~5,000 block references, far more than a 1,000-block cache can span.
  - How large does the cache have to be before LRU does as well as OPT?
    It would need to hold the longest loop in memory;
    in Fig 1a that looks like 3,000 blocks + a 2,000-reference interval.
- Would FBR/LRU-K/2Q/EELRU do better?
  Yes, but each fails to handle sequential and/or looping references.
- Given the data presented, is this a serious problem to fix?
  Not clear; perhaps the machine is just too overloaded.

What is the proposed solution? UBM
1. Detection: classifies references (sequential, looping, and other)
2. Replacement: one scheme per reference class
3. Allocation: partitions cache memory among the 3 categories

- Detection (a classifier sketch follows the allocator notes below):
  - UBM keeps an on-line history of references (fileID, block ref).
  - On a new reference:
    - Classify first references as other.
    - If an earlier reference was to the previous block, mark this reference
      and the earlier ones as sequential.
    - If in sequential mode, mark the end when a reference isn't to the next
      block.
    - If a block from an earlier sequential pattern is referenced again, mark
      the start of a loop; if in loop mode, mark the end of the loop when a
      reference isn't part of the earlier sequential pattern.
    - When marking an end, add a table entry (see figure 4).
- Replacement (victim selection is sketched below):
  - On a block miss:
    - If free memory is available, read the block in.
    - If no free memory:
      - Sequential partition: MRU.
      - Looping partition: replace from the longest-period loop (sort of LRU);
        within a loop, use MRU.
      - Other partition: LRU.
- Allocator:
  - What is marginal gain? The expected number of extra hits per unit time
    from adding one more buffer.
    - Sequential: 0.
    - Loop: 1/(p_{i_max}+1).
    - Other: why not calculate it directly? Too expensive--that would need
      LRU simulated at every queue size. Instead, use an approximation of
      Belady's lifetime function: hit(n) = 1 - c*n^{-k}.
      Estimate c and k from two points (keep ghost buffers); a worked sketch
      follows below.
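A minimal classifier sketch in C for the detection step (structure and
function names are mine, not the paper's; the real detector also requires a
minimum run length before calling a run sequential, which is omitted here).
Consecutive blocks of a file extend the current run, a jump back to a run's
start block turns it into a loop, and anything else is classified as other:

/* Minimal sketch of UBM-style detection (my naming; the paper additionally
 * requires a minimum run length before a run counts as sequential). */
#include <stdio.h>

enum pattern { OTHER, SEQUENTIAL, LOOPING };

struct run {                        /* one (file, start..end) reference run */
    int file, start, end;
    enum pattern type;
};

#define HIST 64
static struct run hist[HIST];       /* completed runs (cf. the table in Fig 4) */
static int nhist;
static struct run cur = { -1, 0, 0, OTHER };   /* run in progress */

static void end_run(void)           /* record the finished run, if any */
{
    if (cur.file >= 0 && nhist < HIST)
        hist[nhist++] = cur;
    cur.file = -1;
}

enum pattern classify(int file, int block)
{
    /* Jumping back to the start of the current run => it is a loop. */
    if (cur.file == file && block == cur.start && cur.end > cur.start) {
        cur.type = LOOPING;
        end_run();
        cur.file = file; cur.start = cur.end = block; cur.type = LOOPING;
        return LOOPING;
    }

    /* Re-reference of the start of a recorded run => also a loop. */
    for (int i = 0; i < nhist; i++)
        if (hist[i].file == file && hist[i].start == block) {
            end_run();
            hist[i].type = LOOPING;
            cur = hist[i]; cur.end = cur.start;
            return LOOPING;
        }

    /* The next consecutive block extends the current run. */
    if (cur.file == file && block == cur.end + 1) {
        cur.end = block;
        if (cur.type == OTHER)
            cur.type = SEQUENTIAL;
        return cur.type;
    }

    /* Anything else: first reference / broken run => other. */
    end_run();
    cur.file = file; cur.start = cur.end = block; cur.type = OTHER;
    return OTHER;
}

int main(void)                      /* two passes over blocks 1..5 of file 0 */
{
    static const char *name[] = { "other", "sequential", "looping" };
    for (int pass = 0; pass < 2; pass++)
        for (int b = 1; b <= 5; b++)
            printf("block %d -> %s\n", b, name[classify(0, b)]);
    return 0;
}

The demo in main() makes two passes over blocks 1..5 of one file: the first
pass comes out sequential (after the first reference, which is other), the
second looping.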
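Victim selection per partition might look like the following (a sketch
assuming each partition keeps its blocks on a list ordered by recency; all
names are mine):

#include <stddef.h>

struct buf {
    struct buf *prev, *next;       /* neighbors on the partition's list */
};

struct list {
    struct buf *head, *tail;       /* head = least, tail = most recently used */
};

struct loop {
    struct list blocks;            /* the loop's cached blocks */
    int period;                    /* references between consecutive passes */
};

/* Sequential partition: evict MRU -- a purely sequential block is never
 * referenced again, so the block just read is the best victim. */
struct buf *victim_seq(struct list *seq)
{
    return seq->tail;
}

/* Looping partition: pick the loop with the longest period (its blocks are
 * needed farthest in the future), then MRU within that loop. */
struct buf *victim_loop(struct loop *loops, int nloops)
{
    struct loop *worst = NULL;
    for (int i = 0; i < nloops; i++)
        if (loops[i].blocks.tail != NULL &&
            (worst == NULL || loops[i].period > worst->period))
            worst = &loops[i];
    return worst != NULL ? worst->blocks.tail : NULL;
}

/* Other partition: plain LRU. */
struct buf *victim_other(struct list *other)
{
    return other->head;
}

The MRU choices fall out of the patterns: a purely sequential block is dead
the moment it is read, and within a loop the most recently touched block is
the one whose next use is a full period away.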
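A worked sketch of the allocator's marginal-gain comparison, with
hypothetical numbers (the sample hit ratios and all names are mine). Taking
logs of hit(n) = 1 - c*n^{-k} at two measured points (n1,h1) and (n2,h2)
gives k = ln((1-h1)/(1-h2)) / ln(n2/n1) and c = (1-h1)*n1^k:

#include <math.h>
#include <stdio.h>

/* Fit hit(n) = 1 - c*n^(-k) through (n1,h1) and (n2,h2). */
static void fit_belady(double n1, double h1, double n2, double h2,
                       double *c, double *k)
{
    *k = log((1.0 - h1) / (1.0 - h2)) / log(n2 / n1);
    *c = (1.0 - h1) * pow(n1, *k);
}

/* Marginal gain of the "other" partition at size n: d hit(n) / dn. */
static double mg_other(double c, double k, double n)
{
    return c * k * pow(n, -k - 1.0);
}

int main(void)
{
    double c, k;

    /* Hypothetical measurements: 60% hits with 100 buffers, 75% hits at
     * the ghost-buffer size of 200. */
    fit_belady(100, 0.60, 200, 0.75, &c, &k);

    double mg_seq  = 0.0;              /* sequential blocks never rehit */
    double mg_loop = 1.0 / (1000 + 1); /* notes: 1/(p_{i_max}+1), period 1,000 */
    double mg_oth  = mg_other(c, k, 100);

    printf("c=%.3f k=%.3f\n", c, k);
    printf("MG seq=%.5f loop=%.5f other=%.5f\n", mg_seq, mg_loop, mg_oth);
    /* UBM grows whichever partition has the largest marginal gain. */
    return 0;
}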
- Hypotheses:
  1. Does it matter?
  2. How accurate is the detection?
  3. How good is replacement vs. OPT?
  4. How good is the allocator vs. OPT?
  5. Are there simpler schemes that work well?
  6. What is the runtime overhead of UBM?
- Environment:
  - Apps with different patterns (why does glimpse have loops?)
  - Trace-driven evaluation (what are the problems with traces?)
  - FreeBSD implementation
  - The sum of the apps' memory requirements is much higher than the
    available memory.
  - A number of apps make a single pass through lots of memory (mpeg).
  - 3 traces
- Results:
  Re 1. Yes.
  Re 2. Good.
  Re 3. Close to OPT.
  Re 4. Don't know, but it seems effective.
  Re 5. Doesn't appear so (unless the programmer specifies a strategy).
  Re 6. 5% vs. LRU.

Explain figure 12 and table 2
- Why does UBM do better (i.e., improve more) with a cold cache?
- When would you actually have a "cold cache" as measured here?
  After rebooting, unmounting a file system, maybe deleting a file...
  Otherwise, the cache will be cold but full of garbage.
- How might you improve UBM in the cold case?
  - Predict that sequential references might turn into loops.
  - Keep ghost buffers in the SEQ pool to measure how useful the blocks
    there would have been (see the sketch at the end).
  - How would you adjust the model? The sequential partition's marginal gain
    is then actually > 0.

Problems
- Unfair
- Indexed data structures (B-trees)
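A speculative sketch of the ghost-buffer idea above (my construction, not
from the paper): remember the block numbers recently evicted from the SEQ
partition, count references that would have hit them, and feed the measured
rate to the allocator in place of the model's constant 0:

#include <stdbool.h>

#define NGHOST 256                 /* ghost slots = extra "would-be" buffers */

static long ghost[NGHOST];         /* block numbers evicted from SEQ;
                                      assumes real block numbers are > 0 */
static int  ghost_next;
static long ghost_hits, seq_refs;

/* Call when the SEQ partition evicts a block: remember its number only. */
void seq_evicted(long blockno)
{
    ghost[ghost_next] = blockno;
    ghost_next = (ghost_next + 1) % NGHOST;
}

/* Call on each sequential reference: would the block have hit had SEQ
 * kept it around? */
bool seq_ghost_hit(long blockno)
{
    seq_refs++;
    for (int i = 0; i < NGHOST; i++)
        if (ghost[i] == blockno) {
            ghost[i] = 0;
            ghost_hits++;
            return true;
        }
    return false;
}

/* Measured gain of giving SEQ roughly NGHOST more buffers, in extra hits
 * per sequential reference; if > 0, the allocator's "sequential = 0"
 * assumption is too pessimistic for this workload. */
double mg_seq_measured(void)
{
    return seq_refs > 0 ? (double)ghost_hits / (double)seq_refs : 0.0;
}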