Usability: * change bootstrap mechanism: generate pstate image instead of using embedbins - removes uinit from kernel - use same process to install HiStar onto a disk - demand-page initial binaries from install CD/USB drive * clean up i386: gcc -m32 by default * performance: shared library prelinking * standardize on a reasonable FS layout * linux compat mode: run vmlinux instead of our unix emulation library - catch pagefault, syscall traps, vector them to the vmlinux idt - for sharing, implement "histarfs" in linux that uses our existing for a labeled file system. * multi-referenced containers - what happens to parent pointers? disable for multi-refable containers.. - depth count, with inf value? not so composable, but decent. * bootstrapping - disk device object: + disk_create(int index) + disk_identify(struct disk_id) -> bus, device#, manuf, serial, size + disk_{read,write}_buf() -> read or write sector buffer for disk device + disk_cmd(uint64_t offset) -> transfer data between disk and sector buffer + disk_status() -> command active/completed - pstate=bootstrap enables disk devices, does not write pstate to disk. + initialize partition table, write a kernel-only boot file to disk - sys_pstate_enable() disables disk device access, writes pstate to disk. + writes the user-level persistent state to disk for later use. * dstar-style labels: split information flow label from privilege label (stars and clearance). optimize labels as hashed sets of 64-bit values. * privilege objects: effectively a kobj_label, with one information flow label protecting the privilege, and a privilege label storing privilege. extend cobj_ref to hold a container entry of the privilege label to use, or 0 for a thread's local privilege. * process/page table objects: share page table based on a common privilege object. need to also agree on the information flow label used to fill in the page table. page table has container entry of address space object, which also specifies the privilege object used to fill in the page table. * optimize container entry lookups by hashing containers? * multi-core support: scheduling, concurrency control at object level in kernel, locality mechanisms HW support: * AHCI performance problems * e1000 autoneg -- 10mbps works, 100mbps does not? Compatibility issues: * implement libc-tsd for correct multi-threaded libc-rpc * MT-safe c++ support (lib/cppsup/bits/gthr.h) Performance: * deal with fragmentation on-disk * partial segment swapin and swapout * page reclamation, LRU (unpin callbacks for reclaiming space?) * update lwip (http://pdos.csail.mit.edu/~sbw/lwip/) Kernel bugs: * reuse flushed but not committed blocks in the on-disk log for the same disk address when writing pages to the log; otherwise a page can be written to the log multiple times in a single transaction, running the system out of log space. need to track commits in the log code to do this (currently not tracked). * bugs in handling out-of-memory conditions [low priority for now]: - freelist/dstack: free_later list using a btree - implement log.disk_map using a btree - larger FRM, stick FRM contents in a separate disk block * do something about uint128_t on i386 -- it's not there natively * kobject_get_page could fail when out of memory with copy-on-write; get rid of assertions that it succeeds (e.g. saving FP state -- get the FP page before running the kernel thread, if not enough memory then suspend the thread until it can allocate memory.) * do the right things happen with th_as caching? thread lowers label? * page_info::pi_parent is no longer necessary; use pi_plist instead Security: * netd_server doesn't do any checking on who's asking for what fd. socket() should grant a handle at * to the caller that will let them later do other socket operations. * taint_ct in gate_call/gate_return should be labeled with gate-call handles. * library call to run with a subset of your privileges, for doing some operation on behalf of the caller (e.g. in the declassifier gate). * should be more careful in dealing with segments passed in via gates; namely, need to avoid page faults, and ensure that we access them only with caller's privileges (verify label). can arise for two reasons: first, we map the segment into our AS but don't have the privilege to read or write to it. this can be fixed by segment_map() checking the label. second, the other side can drop the segment and the service (other side of gate) will fault because the segment is gone. for this, need either to copy all segments first, or to have a careful-read and careful-write wrapper that does setjmp() first, and the segfault handler will longjmp(). * force checkpoint every N handle allocations, and roll forward handle counter by N after restore, for some large value N. * real randomness... * system calls should be atomic (all-or-nothing), otherwise we have a covert storage channel when you can tell a system call executed partway and suspended when it hit something. the culprits are variable-length data structures -- AS and labels -- sys_as_get, sys_as_set, and anything that calls label_to_ulabel: sys_obj_get_label, sys_self_get_{clearance,verify}, sys_gate_clearance. * probably should scrub state more often across gate calls: gate_call_data passed to gate_call::call() and gatesrv_return::ret(), and TLS stack (mask utraps before doing this, to avoid race conditions). look at lib/authclnt.cc. * what happens when gate caller doesn't grant all of the gate's own privileges? maybe change sys_gate_enter: caller has to grant all stars that the gate has? a lot of code already assumes this, and would have to check whether we lost any stars on every gate entry in user-space.. is it expensive to do this in user-space? * you can provide an arbitrary gate for a gate-based service to return to, and gatesrv does no checks to make sure the caller could have really invoked it in the first place. should fetch the return gate's verify label VL_RG and check that VL_T \leq VL_RG. * threads can detect reboot using SIDT, SGDT, SLDT, STR? place any such global structs at deterministic memory locations in kernel. * can use CPUID, %eflags[ID] to detect hardware changes * should disable RDTSC from user-level, except we use it for profiling now.. what to do about high-precision time in general? * extend the scheduling quantum when we take a hardware interrupt? to allow ssl_eprocd to finish its RSA operation without preemption. * does rflags.RF allow user-space to detect kernel faults? this bit is always cleared by IRET. Cleanup: * better separation between kernel, user-space and shared header files * user-space resource allocators -- i.e. running without CT_QUOTA_INF * (?) expose kobj_label's at the syscall API to get label sharing * lwip: fix sys_arch_timeouts() in lib/lwip/jos64/arch/sys_arch.c; need a callback for when threads enter/leave an AS. * sys_container_get_parent(): check thread can read parent (for later chown) * potentially unify spawn & gate invocation. spawn could be split into two parts: (1) create new address space and a gate vectoring into _start, and (2) invoke said gate with whatever privileges, arguments, etc that we want. * provide resource-clean abstractions for starting a thread on TLS, and jumping through a gate without leaving any resources behind. * enable __UCLIBC_DYNAMIC_ATEXIT__; the problem with it is that global objects get registered from _init, at which point there isn't a start_env, which makes dynamic allocation difficult. Bugs: * netd bug (fetch'ing files hangs for some files but not others) * look for concurrency bugs using test harness (make disk ops async) * netd connect() does not honor O_NONBLOCK Larger issues: * system management: kernel upgrades? system install? partitioning a disk? * backup/restore * system management in a fully-persistent world: - restarting services with handles and gates - upgrades (signing binaries => integrity handles?) * expose CPU resource allocation: track resources in container entries and re-compute a thread's th_sched_tickets when it changes its label to only include resources from containers it can read (set_sched_parents). associate scheduling type (time-based, instruction-based) and cache clearing flag with container. * Linux in histar userspace to replace LWIP with Linux's TCP stack. * port SANE switches to HiStar (Ed) * "decentralized SANE" to reduce covert channels in distributed HiStar * trusted path for voting machines? port python to HiStar for voting software. New hardware support: * hyperthreading/multicore: use monitor/mwait or pause/spinwait for fast sys-/vmm-calls. * debug registers: use call tracing for kernel debugging? intel CPUs have 16-slot buffers it seems, while AMD traps on every jmp. * iommu: device drivers into user-mode. some more thoughts on hardcopy paper. not so easy due to overly-trusting PCI-PCI bridges. it may be OK to have one "device driver" protection domain for most PCI devices, except that the disk is a fully-trusted device. if we can ensure that bad PCI cards cannot take over the disk, then we may be OK. also handle NMI. * cpu virt: containerize SMM (do any desktops use SMM much? usb kbd virt?) * cpu virt: move parts of the current CPL=0 kernel into a VMM ("ring -1") and make the CPL=0 code less trusted. * cpu virt: with nested page tables, implement a "vmrun" system call that will allow us to run a VM in a HiStar thread. for devices, use qemu which was ported to do the same thing for Linux's KVM. * use ata "nv cache" (flash memory) feature set to speed up disk sync. Miscellaneous: === multiple threads that have different labels in the same AS is a little strange. when you want to give an fd to another thread, you have to grant it fd handles at *; similarly, when you close an fd, you should make sure no other thread has those handles at * (otherwise they'll never get GCed from the label). maybe it makes sense to have some sort of label shared by threads that goes along with an AS? for discretionary handles (at *) we could implement this at user-space, by using lib/privstore.cc to stash away/drop privilege. for handles at non-* levels, this might be a non-issue? possible answer: store all stars in a privstore; stars in thread label are a cache. how to catch -E_LABEL errors and how to fetch the right handle?